Execute Strategus
Anthony G. Sena
2024-10-29
Source:vignettes/ExecuteStrategus.Rmd
A Strategus study is defined by analysis specifications. These specifications describe which modules to run, with which settings. The Creating Analysis Specification vignette describes how to create analysis specifications. In this vignette, we demonstrate how to run a study defined by an analysis specification.
Creating execution settings
Strategus execution requires you to specify execution settings. These settings describe how the study should be executed in a specific environment, for example how to connect to a database and which local folders to use. Many Strategus studies run against data in the OMOP Common Data Model (CDM), and this vignette focuses on that type of study. (Other studies, such as meta-analyses, may run against results data instead.) In this example, we use the Eunomia data set, an OMOP CDM with simulated data intended for example purposes. When running a study against your own CDM data, you will need to specify the database connection details for your environment. Execution settings for studies against the CDM can be created using createCdmExecutionSettings().
Creating the connection details
In this example, we first create a connectionDetails object for Eunomia. In your environment, the connectionDetails would be specific to your OMOP CDM. Please see the DatabaseConnector package documentation for more details.
library(Strategus)
library(Eunomia)
connectionDetails <- getEunomiaConnectionDetails()
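For a CDM hosted on your own database server, the connection details might instead be built with DatabaseConnector::createConnectionDetails(). The following is only a sketch under stated assumptions: the PostgreSQL platform, the fallback server string, and the environment variable names CDM_SERVER, CDM_USER, and CDM_PASSWORD are hypothetical placeholders, not values used by Strategus. DATABASECONNECTOR_JAR_FOLDER is the environment variable DatabaseConnector reads to locate JDBC drivers.

```r
# Sketch only: replace the platform, server, and credentials with your own.
# CDM_SERVER, CDM_USER, and CDM_PASSWORD are hypothetical variable names.
driverFolder <- Sys.getenv("DATABASECONNECTOR_JAR_FOLDER")
server <- Sys.getenv("CDM_SERVER", unset = "localhost/ohdsi")

if (requireNamespace("DatabaseConnector", quietly = TRUE) && dir.exists(driverFolder)) {
  # Credentials are read from the environment rather than hard-coded.
  connectionDetails <- DatabaseConnector::createConnectionDetails(
    dbms = "postgresql",
    server = server,
    user = Sys.getenv("CDM_USER"),
    password = Sys.getenv("CDM_PASSWORD"),
    pathToDriver = driverFolder
  )
} else {
  message("DatabaseConnector or the JDBC driver folder is not available.")
}
```

Reading credentials from environment variables keeps them out of the study script, which is convenient when the same script is shared across sites.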
Creating an execution settings object
Next, we will use Strategus to create the CDM execution settings:
outputFolder <- tempfile("vignetteFolder")
dir.create(outputFolder)
executionSettings <- createCdmExecutionSettings(
  workDatabaseSchema = "main",
  cdmDatabaseSchema = "main",
  cohortTableNames = CohortGenerator::getCohortTableNames(),
  workFolder = file.path(outputFolder, "work_folder"),
  resultsFolder = file.path(outputFolder, "results_folder"),
  minCellCount = 5
)
Finally, we can write the execution settings to the file system to capture this information.
ParallelLogger::saveSettingsToJson(
  object = executionSettings,
  fileName = file.path(outputFolder, "eunomiaExecutionSettings.json")
)
Executing the study
For this study, we will use the analysis specifications created for testing Strategus, together with the execution settings we created earlier:
analysisSpecifications <- ParallelLogger::loadSettingsFromJson(
  fileName = system.file(
    "testdata/cdmModulesAnalysisSpecifications.json",
    package = "Strategus"
  )
)
executionSettings <- ParallelLogger::loadSettingsFromJson(
  fileName = file.path(outputFolder, "eunomiaExecutionSettings.json")
)
And finally we execute the study:
execute(
  connectionDetails = connectionDetails,
  analysisSpecifications = analysisSpecifications,
  executionSettings = executionSettings
)
This first instantiates any modules that have not already been instantiated, and then executes each module in sequence according to the analysis specifications. The results appear in subfolders of the ‘results_folder’ specified in the execution settings.
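To get a quick overview of what was written, the subfolders of the results folder can be listed with base R. The helper below is a hypothetical convenience function, not part of Strategus, and it makes no assumption about the names or contents of the subfolders: it only counts the files found in each one.

```r
# Hypothetical helper (not part of Strategus): summarize each subfolder
# of a results folder by the number of files it contains.
summarizeResultsFolder <- function(resultsFolder) {
  # Only the immediate subfolders of the results folder are listed.
  subfolders <- list.dirs(resultsFolder, full.names = TRUE, recursive = FALSE)
  data.frame(
    subfolder = basename(subfolders),
    fileCount = vapply(
      subfolders,
      # Count files at any depth below the subfolder.
      function(folder) length(list.files(folder, recursive = TRUE)),
      integer(1),
      USE.NAMES = FALSE
    ),
    stringsAsFactors = FALSE
  )
}

# Usage with the folder from this vignette:
# summarizeResultsFolder(file.path(outputFolder, "results_folder"))
```

A subfolder with a file count of zero may indicate a module that did not produce output, which is worth checking in the execution log.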
Once the analysis is complete, you can review the study results. For more information, see the Working with Results article.