Execute Strategus
Anthony G. Sena
2024-10-03
Source:vignettes/ExecuteStrategus.Rmd
A Strategus study is defined by analysis specifications. These specifications describe which modules to run, with which settings. The Creating Analysis Specification vignette describes how to create analysis specifications. In this vignette, we demonstrate how to run a study once it is specified.
Creating execution settings
In addition to analysis specifications, Strategus also requires execution settings. The execution settings specify how the study should be executed in a specific environment, for example how to connect to a database and which local folders to use. Many Strategus studies run against data in the OMOP Common Data Model (CDM), and this vignette focuses on that type of study. (Other studies, such as meta-analyses, may run against results data instead.) In this example, we use the Eunomia data set, an OMOP CDM with simulated data intended for example purposes. When running a study against your own CDM data, you will need to specify the database connection details for your environment. Execution settings for studies against the CDM can be created using createCdmExecutionSettings().
Creating the connection details
In this example, we first create a connectionDetails object for Eunomia. In your environment, the connectionDetails would be specific to your OMOP CDM. Please see the DatabaseConnector package documentation for more details.
library(Strategus)
library(Eunomia)
connectionDetails <- getEunomiaConnectionDetails()
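For a database other than Eunomia, connection details are typically built with DatabaseConnector::createConnectionDetails(). The following is a minimal sketch, assuming a PostgreSQL server. The server string and the environment variable names CDM_USER and CDM_PASSWORD are placeholders chosen for illustration, not values from this vignette. Connecting also requires a JDBC driver for your platform, as described in the DatabaseConnector documentation.

```r
library(DatabaseConnector)

# Placeholder values: replace these with the details of your own environment.
connectionDetails <- createConnectionDetails(
  dbms = "postgresql",
  # Hypothetical host and database name, in "host/database" form
  server = "my-server.example.org/my_database",
  # Read credentials from environment variables instead of hard-coding them
  user = Sys.getenv("CDM_USER"),
  password = Sys.getenv("CDM_PASSWORD"),
  port = 5432
)
```

Creating the object does not open a connection, so any mistakes in these values only surface when the study is executed.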
Creating an execution settings object
Next, we will use Strategus to create the CDM execution settings:
outputFolder <- tempfile("vignetteFolder")
dir.create(outputFolder)
executionSettings <- createCdmExecutionSettings(
  workDatabaseSchema = "main",
  cdmDatabaseSchema = "main",
  cohortTableNames = CohortGenerator::getCohortTableNames(),
  workFolder = file.path(outputFolder, "work_folder"),
  resultsFolder = file.path(outputFolder, "results_folder"),
  minCellCount = 5
)
Finally, we can write out the execution settings to the file system to capture this information.
ParallelLogger::saveSettingsToJson(
  object = executionSettings,
  fileName = file.path(outputFolder, "eunomiaExecutionSettings.json")
)
Executing the study
Loading the analysis specifications and execution settings
For this study, we will use analysis specifications created elsewhere, and the execution settings we created earlier:
analysisSpecifications <- ParallelLogger::loadSettingsFromJson(
  fileName = system.file("testdata/cdmModulesAnalysisSpecifications.json",
    package = "Strategus"
  )
)
executionSettings <- ParallelLogger::loadSettingsFromJson(
  fileName = file.path(outputFolder, "eunomiaExecutionSettings.json")
)
Finally, we execute the study:
execute(
  connectionDetails = connectionDetails,
  analysisSpecifications = analysisSpecifications,
  executionSettings = executionSettings
)
This will first instantiate all the modules if they haven’t already been instantiated, and will then execute each module in sequence according to the analysis specifications. The results will appear in subfolders of the ‘results_folder’, as specified in the execution settings.
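To check what a run produced, the results folder can be inspected with base R. The sketch below defines a small helper, summarizeResultsFolder(), which is hypothetical and not part of Strategus. It counts the files found in each top-level subfolder and makes no assumption about how the subfolders are named.

```r
# Hypothetical helper (not part of Strategus): count the files found in each
# top-level subfolder of a results folder.
summarizeResultsFolder <- function(resultsFolder) {
  # Paths are returned relative to resultsFolder, using "/" as separator
  files <- list.files(resultsFolder, recursive = TRUE)
  # Keep only files that live inside a subfolder; files in the root are ignored
  files <- files[grepl("/", files, fixed = TRUE)]
  if (length(files) == 0) {
    return(data.frame(
      subfolder = character(0),
      fileCount = integer(0),
      stringsAsFactors = FALSE
    ))
  }
  # The first path component is the name of the top-level subfolder
  subfolder <- vapply(strsplit(files, "/", fixed = TRUE), `[`, character(1), 1)
  counts <- table(subfolder)
  data.frame(
    subfolder = names(counts),
    fileCount = as.integer(counts),
    stringsAsFactors = FALSE
  )
}
```

With the settings used in this vignette, the helper would be called as summarizeResultsFolder(file.path(outputFolder, "results_folder")). A result with zero rows suggests that no output was written to any subfolder.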