Runs the cohort diagnostics on all (or a subset of) the cohorts instantiated using the ROhdsiWebApi::insertCohortDefinitionSetInPackage function. Assumes the cohorts have already been instantiated.
Characterization:
If the runTemporalCohortCharacterization argument is TRUE, a default covariateSettings object is created using FeatureExtraction::createTemporalCovariateSettings (see the default value of temporalCovariateSettings in the usage).
Alternatively, a covariate settings object may be created using that default as an example.
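As a rough illustration of what the default temporal settings cover, the sketch below pairs the temporalStartDays and temporalEndDays values from the usage signature into windows relative to the cohort index date. The lengthDays column assumes both endpoints are inclusive, and the FeatureExtraction call runs only if that package is installed.

```r
# Default window boundaries, copied from the usage signature.
temporalStartDays <- c(-365, -30, 0, 1, 31)
temporalEndDays <- c(-31, -1, 0, 30, 365)

# One row per window; days are relative to the cohort index date.
temporalWindows <- data.frame(
  startDay = temporalStartDays,
  endDay = temporalEndDays
)
# Assumption: both endpoints are inclusive.
temporalWindows$lengthDays <- temporalWindows$endDay - temporalWindows$startDay + 1
print(temporalWindows)

# Build a settings object with the same arguments as the documented default.
if (requireNamespace("FeatureExtraction", quietly = TRUE)) {
  temporalCovariateSettings <- FeatureExtraction::createTemporalCovariateSettings(
    useConditionOccurrence = TRUE,
    useDrugEraStart = TRUE,
    useProcedureOccurrence = TRUE,
    useMeasurement = TRUE,
    temporalStartDays = temporalStartDays,
    temporalEndDays = temporalEndDays
  )
}
```

A custom object built this way can be passed as the temporalCovariateSettings argument.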
runCohortDiagnostics(
packageName = NULL,
cohortToCreateFile = "settings/CohortsToCreate.csv",
cohortDefinitionSet = NULL,
baseUrl = NULL,
cohortSetReference = NULL,
connectionDetails = NULL,
connection = NULL,
cdmDatabaseSchema,
oracleTempSchema = NULL,
tempEmulationSchema = getOption("sqlRenderTempEmulationSchema"),
cohortDatabaseSchema,
vocabularyDatabaseSchema = cdmDatabaseSchema,
cohortTable = "cohort",
cohortTableNames = CohortGenerator::getCohortTableNames(cohortTable = cohortTable),
cohortIds = NULL,
inclusionStatisticsFolder = NULL,
exportFolder,
databaseId,
databaseName = databaseId,
databaseDescription = databaseId,
cdmVersion = 5,
runInclusionStatistics = TRUE,
runIncludedSourceConcepts = TRUE,
runOrphanConcepts = TRUE,
runTimeDistributions = TRUE,
runVisitContext = TRUE,
runBreakdownIndexEvents = TRUE,
runIncidenceRate = TRUE,
runTimeSeries = FALSE,
runCohortOverlap = TRUE,
runCohortCharacterization = TRUE,
covariateSettings = createDefaultCovariateSettings(),
runTemporalCohortCharacterization = TRUE,
temporalCovariateSettings = createTemporalCovariateSettings(useConditionOccurrence =
TRUE, useDrugEraStart = TRUE, useProcedureOccurrence = TRUE, useMeasurement = TRUE,
temporalStartDays = c(-365, -30, 0, 1, 31), temporalEndDays = c(-31, -1, 0, 30, 365)),
minCellCount = 5,
incremental = FALSE,
incrementalFolder = file.path(exportFolder, "incremental")
)
packageName: The name of the package containing the cohort definitions. Can be left NULL if baseUrl and cohortSetReference have been specified.
cohortToCreateFile: The location of the cohortToCreate file within the package. Ignored if baseUrl and cohortSetReference have been specified. The cohortToCreateFile must be a .csv file that can be read into a data frame meeting the same requirements as the cohortSetReference argument. The file is expected to be encoded in either ASCII or UTF-8; if it is not, an error message is displayed and the process stops.
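The encoding requirement for the cohortToCreate file can be checked up front. The following is a minimal base R sketch; isAsciiOrUtf8 is a hypothetical helper, and the way the package itself detects the encoding is not described here and may differ.

```r
# Hypothetical helper: TRUE when the file's bytes form valid UTF-8.
# Plain ASCII is a subset of UTF-8, so ASCII files pass as well.
isAsciiOrUtf8 <- function(path) {
  bytes <- readBin(path, what = "raw", n = file.size(path))
  if (any(bytes == as.raw(0))) {
    # Embedded NUL bytes suggest UTF-16/UTF-32 or binary content.
    return(FALSE)
  }
  validUTF8(rawToChar(bytes))
}

# Example with a small temporary file (contents are illustrative only).
exampleFile <- tempfile(fileext = ".csv")
writeLines(c("cohortId,cohortName", "1001,Example cohort"), exampleFile)

if (!isAsciiOrUtf8(exampleFile)) {
  stop("cohortToCreateFile must be encoded in ASCII or UTF-8.")
}
```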
cohortDefinitionSet: Data frame of cohorts. Must include the columns cohortId, cohortName, json, and sql.
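A quick way to confirm that a cohortDefinitionSet has the four required columns is sketched below. The cohort ids, names, JSON, and SQL strings are placeholders invented for illustration, not real cohort definitions.

```r
# Columns required by the cohortDefinitionSet argument.
requiredColumns <- c("cohortId", "cohortName", "json", "sql")

# Placeholder data frame; real json/sql values come from the cohort definitions.
cohortDefinitionSet <- data.frame(
  cohortId = c(1001, 1002),
  cohortName = c("Example cohort A", "Example cohort B"),
  json = c("{}", "{}"),
  sql = c("SELECT 1;", "SELECT 1;"),
  stringsAsFactors = FALSE
)

# Stop early with a readable message if any required column is absent.
missingColumns <- setdiff(requiredColumns, names(cohortDefinitionSet))
if (length(missingColumns) > 0) {
  stop("cohortDefinitionSet is missing columns: ",
       paste(missingColumns, collapse = ", "))
}
```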
baseUrl: The base URL for the WebApi instance, for example: "http://server.org:80/WebAPI". Can be left NULL if packageName and cohortToCreateFile have been specified.
cohortSetReference: A data frame with four columns, as described in the details. Can be left NULL if packageName and cohortToCreateFile have been specified.
connectionDetails: An object of type connectionDetails as created using the createConnectionDetails function in the DatabaseConnector package. Can be left NULL if connection is provided.
connection: An object of type connection as created using the connect function in the DatabaseConnector package. Can be left NULL if connectionDetails is provided, in which case a new connection will be opened at the start of the function, and closed when the function finishes.
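The interplay between connectionDetails and connection can be sketched as follows. withDiagnosticsConnection is a hypothetical helper written for illustration and is not part of DatabaseConnector or CohortDiagnostics; the connectFun and disconnectFun parameters are likewise assumptions, added so the pattern can be tried without a database. Their defaults, DatabaseConnector::connect and DatabaseConnector::disconnect, are only evaluated when a new connection actually has to be opened.

```r
# Hypothetical helper: reuse a supplied connection, or open one from
# connectionDetails and close it again when the work is done.
withDiagnosticsConnection <- function(connectionDetails = NULL,
                                      connection = NULL,
                                      task,
                                      connectFun = DatabaseConnector::connect,
                                      disconnectFun = DatabaseConnector::disconnect) {
  if (is.null(connection)) {
    if (is.null(connectionDetails)) {
      stop("Provide either connectionDetails or connection.")
    }
    connection <- connectFun(connectionDetails)
    # Only connections opened here are closed here; a caller-supplied
    # connection is left open for the caller to manage.
    on.exit(disconnectFun(connection), add = TRUE)
  }
  task(connection)
}
```

With this pattern, a caller who passes an existing connection stays responsible for closing it, which matches the behavior described for the connection argument.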
cdmDatabaseSchema: Schema name where your patient-level data in OMOP CDM format resides. Note that for SQL Server, this should include both the database and schema name, for example 'cdm_data.dbo'.
oracleTempSchema: DEPRECATED by DatabaseConnector: use tempEmulationSchema instead.
tempEmulationSchema: Some database platforms like Oracle and Impala do not truly support temp tables. To emulate temp tables, provide a schema with write privileges where temp tables can be created.
cohortDatabaseSchema: Schema name where your cohort table resides. Note that for SQL Server, this should include both the database and schema name, for example 'scratch.dbo'.
vocabularyDatabaseSchema: Schema name where your OMOP vocabulary data resides. This is commonly the same as cdmDatabaseSchema. Note that for SQL Server, this should include both the database and schema name, for example 'vocabulary.dbo'.
cohortTable: Name of the cohort table.
cohortTableNames: Cohort table names used by the CohortGenerator package.
cohortIds: Optionally, provide a subset of cohort IDs to restrict the diagnostics to.
inclusionStatisticsFolder: The folder where the inclusion rule statistics are stored. Can be left NULL if runInclusionStatistics = FALSE.
exportFolder: The folder where the output will be exported to. If this folder does not exist it will be created.
databaseId: A short string for identifying the database (e.g. 'Synpuf').
databaseName: The full name of the database. If NULL, defaults to databaseId.
databaseDescription: A short description (several sentences) of the database. If NULL, defaults to databaseId.
cdmVersion: The version of the OMOP CDM. Default 5. (Note: only 5 is supported.)
runInclusionStatistics: Generate and export statistics on the cohort inclusion rules?
runIncludedSourceConcepts: Generate and export the source concepts included in the cohorts?
runOrphanConcepts: Generate and export potential orphan concepts?
runTimeDistributions: Generate and export cohort time distributions?
runVisitContext: Generate and export index-date visit context?
runBreakdownIndexEvents: Generate and export the breakdown of index events?
runIncidenceRate: Generate and export the cohort incidence rates?
runTimeSeries: Generate and export the cohort prevalence rates?
runCohortOverlap: Generate and export the cohort overlap? Overlaps are checked within cohortIds that have the same phenotype ID sourced from the cohortSetReference or cohortToCreateFile.
runCohortCharacterization: Generate and export the cohort characterization? Only records with values greater than 0.0001 are returned.
covariateSettings: Either an object of type covariateSettings as created using one of the createCovariateSettings functions in the FeatureExtraction package, or a list of such objects.
runTemporalCohortCharacterization: Generate and export the temporal cohort characterization? Only records with values greater than 0.001 are returned.
temporalCovariateSettings: Either an object of type covariateSettings as created using the createTemporalCovariateSettings function in the FeatureExtraction package, or a list of such objects.
minCellCount: The minimum cell count for fields containing person counts or fractions.
incremental: Create only cohort diagnostics that haven't been created before?
incrementalFolder: If incremental = TRUE, specify a folder where records are kept of which cohort diagnostics have been executed.
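To give a feel for what a minimum cell count does, here is a small base R sketch of small-cell suppression. It is an illustration only: the example counts are made up, the strict "less than" comparison is an assumption, and replacing masked values with NA is a choice made for this sketch rather than the package's documented masking convention.

```r
minCellCount <- 5

# Made-up person counts per cohort, for illustration only.
counts <- data.frame(
  cohortId = c(1, 2, 3),
  persons = c(120, 3, 5)
)

# Assumption: non-zero counts strictly below minCellCount are masked.
counts$suppressed <- counts$persons > 0 & counts$persons < minCellCount
counts$persons[counts$suppressed] <- NA
print(counts)
```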
Currently two ways of executing this function are supported, either
(1) [Package Mode] embedded in a study package, assuming the cohort definitions are stored in that package using the ROhdsiWebApi::insertCohortDefinitionSetInPackage function, or
(2) [WebApi Mode] by using a WebApi interface to retrieve the cohort definitions.
When using this function in Package Mode: use the packageName and cohortToCreateFile arguments to specify the name of the study package, and the name of the cohortToCreate file within that package, respectively.
When using this function in WebApi Mode: use the baseUrl and cohortSetReference arguments to specify how to connect to the WebApi, and which cohorts to fetch, respectively.
Note: if the parameters for both Package Mode and WebApi Mode are provided, then Package mode is preferred.
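The precedence rule above can be expressed as a small decision function. resolveExecutionMode is a hypothetical helper written for this illustration; the package's internal logic may be organized differently.

```r
# Hypothetical helper: decide which mode a set of arguments selects.
# Package Mode wins when the arguments for both modes are supplied.
resolveExecutionMode <- function(packageName = NULL,
                                 cohortToCreateFile = NULL,
                                 baseUrl = NULL,
                                 cohortSetReference = NULL) {
  packageMode <- !is.null(packageName) && !is.null(cohortToCreateFile)
  webApiMode <- !is.null(baseUrl) && !is.null(cohortSetReference)
  if (packageMode) {
    return("package")
  }
  if (webApiMode) {
    return("webapi")
  }
  stop("Specify packageName and cohortToCreateFile, or baseUrl and cohortSetReference.")
}

# Both sets of arguments supplied: Package Mode is preferred.
resolveExecutionMode(
  packageName = "examplePackage",  # placeholder package name
  cohortToCreateFile = "settings/CohortsToCreate.csv",
  baseUrl = "http://server.org:80/WebAPI",
  cohortSetReference = data.frame(cohortId = 1001)
)
```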
The cohortSetReference argument must be a data frame with the following columns:
The cohort Id is the id used to identify a cohort definition. This is required to be unique. It will be used to create file names. It is recommended to be (referentConceptId * 1000) + a number between 3 and 999.
Cohort Id in the webApi/atlas instance. It is a required field to run Cohort Diagnostics in WebApi mode. It is discarded in package mode.
The full name of the cohort. This will be shown in the Shiny app.
A brief, human-readable description of the cohort definition. This does not have to be a fully specified description of the cohort definition, but should provide enough context to help the user understand its meaning.
A standard OMOP concept id that serves as the referent phenotype definition for the cohort Id (optional).
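The recommended cohort Id scheme is simple arithmetic, sketched below. makeCohortId is a hypothetical helper, and the referent concept id is an arbitrary example value.

```r
# Hypothetical helper implementing the recommended scheme:
# cohortId = (referentConceptId * 1000) + a number between 3 and 999.
makeCohortId <- function(referentConceptId, suffix) {
  stopifnot(all(suffix >= 3), all(suffix <= 999))
  referentConceptId * 1000 + suffix
}

exampleReferentConceptId <- 201826  # arbitrary example value
cohortIds <- makeCohortId(exampleReferentConceptId, c(3, 4, 5))

# Cohort Ids must be unique because they are used to create file names.
stopifnot(anyDuplicated(cohortIds) == 0)
```

Under this scheme the referent concept id can be recovered from a cohort Id with integer division by 1000.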