This function executes a large set of SQL statements against the database in OMOP CDM format to extract the data needed to perform the analysis.
getDbCohortMethodData(
  connectionDetails,
  cdmDatabaseSchema,
  oracleTempSchema = cdmDatabaseSchema,
  targetId,
  comparatorId,
  outcomeIds,
  studyStartDate = "",
  studyEndDate = "",
  exposureDatabaseSchema = cdmDatabaseSchema,
  exposureTable = "drug_era",
  outcomeDatabaseSchema = cdmDatabaseSchema,
  outcomeTable = "condition_occurrence",
  cdmVersion = "5",
  excludeDrugsFromCovariates = TRUE,
  firstExposureOnly = FALSE,
  removeDuplicateSubjects = FALSE,
  restrictToCommonPeriod = FALSE,
  washoutPeriod = 0,
  maxCohortSize = 0,
  covariateSettings
)
An R object of type
The name of the database schema that contains the OMOP CDM instance. Requires read permissions to this database. On SQL Server, this should specify both the database and the schema, for example 'cdm_instance.dbo'.
For Oracle only: the name of the database schema where you want all temporary tables to be managed. Requires create/insert permissions to this database.
A unique identifier to define the target cohort. If exposureTable = DRUG_ERA, targetId is a CONCEPT_ID and all descendant concepts within that CONCEPT_ID will be used to define the cohort. If exposureTable <> DRUG_ERA, targetId is used to select the cohort_concept_id in the cohort-like table.
A unique identifier to define the comparator cohort. If exposureTable = DRUG_ERA, comparatorId is a CONCEPT_ID and all descendant concepts within that CONCEPT_ID will be used to define the cohort. If exposureTable <> DRUG_ERA, comparatorId is used to select the cohort_concept_id in the cohort-like table.
A list of cohort_definition_ids used to define outcomes.
A calendar date specifying the earliest date on which a cohort index date can fall. Date format is 'yyyymmdd'.
A calendar date specifying the latest date on which a cohort index date can fall. Date format is 'yyyymmdd'. Important: the study end date is also used to truncate risk windows, meaning no outcomes beyond the study end date will be considered.
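As a small illustration of the 'yyyymmdd' format, the sketch below converts ordinary R dates into strings of that shape using base R only. The helper name `toStudyDate` is hypothetical and not part of the package.

```r
# Hypothetical helper: format a date (or ISO date string) as 'yyyymmdd'.
toStudyDate <- function(date) {
  format(as.Date(date), "%Y%m%d")
}

# Placeholder dates, chosen only for illustration.
studyStartDate <- toStudyDate("2010-01-01")
studyEndDate <- toStudyDate(as.Date("2015-12-31"))
print(c(studyStartDate, studyEndDate))
```

The resulting strings can then be passed as the study start and end date arguments.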
The name of the database schema that is the location where the exposure data used to define the exposure cohorts is available. If exposureTable = DRUG_ERA, exposureDatabaseSchema is not used but assumed to be cdmSchema. Requires read permissions to this database.
The table name that contains the exposure cohorts. If exposureTable <> DRUG_ERA, then exposureTable is expected to have the format of the COHORT table: cohort_concept_id, SUBJECT_ID, COHORT_START_DATE, COHORT_END_DATE.
The name of the database schema that is the location where the data used to define the outcome cohorts is available. If outcomeTable = CONDITION_ERA, outcomeDatabaseSchema is not used but assumed to be cdmSchema. Requires read permissions to this database.
The table name that contains the outcome cohorts. If outcomeTable <> CONDITION_OCCURRENCE, then outcomeTable is expected to have the format of the COHORT table: COHORT_DEFINITION_ID, SUBJECT_ID, COHORT_START_DATE, COHORT_END_DATE.
Define the OMOP CDM version used: currently supports "4" and "5".
Should the target and comparator drugs (and their descendant concepts) be excluded from the covariates? Note that this only works if the drugs are actually drug concept IDs (and not cohort IDs).
Should only the first exposure per subject be included? Note
that this is typically done in the
Remove subjects that are in both the target and comparator
cohort? See details for allowed values. Note that this is typically done in the
Restrict the analysis to the period when both treatments are observed?
The minimum required continuous observation time prior to index
date for a person to be included in the cohort. Note that this
is typically done in the
If either the target or the comparator cohort is larger than
this number it will be sampled to this size.
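The capping behaviour described above can be pictured with a toy sketch in base R. This is not the package's implementation: the helper `capCohort` is hypothetical, and treating a value of 0 as "no limit" is an assumption made here for the sake of the example.

```r
# Hypothetical helper: randomly sample a cohort down to maxCohortSize rows.
# Assumption: maxCohortSize = 0 means "do not cap".
capCohort <- function(cohort, maxCohortSize) {
  if (maxCohortSize > 0 && nrow(cohort) > maxCohortSize) {
    keep <- sample.int(nrow(cohort), maxCohortSize)
    cohort <- cohort[keep, , drop = FALSE]
  }
  cohort
}

set.seed(1)
target <- data.frame(subjectId = 1:10)       # larger than the cap
comparator <- data.frame(subjectId = 11:13)  # smaller than the cap

cappedTarget <- capCohort(target, maxCohortSize = 5)
cappedComparator <- capCohort(comparator, maxCohortSize = 5)
print(nrow(cappedTarget))
print(nrow(cappedComparator))
```

Each cohort is capped independently, so a cohort already below the limit is left unchanged.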
An object of type
Returns an object of type
cohortMethodData, containing information on the cohorts, their
outcomes, and baseline covariates. Information about multiple outcomes can be captured at once for
efficiency reasons. This object is a list with the following components:
A data frame listing the outcomes per person, including the time to event, and the outcome id. Outcomes are not yet filtered based on risk window, since this is done at a later stage.
A data frame listing the persons in each cohort, listing their exposure status as well as the time to the end of the observation period and time to the end of the cohort (usually the end of the exposure era).
An ffdf object listing the baseline covariates per person in the two cohorts. This is done using a sparse representation: covariates with a value of 0 are omitted to save space.
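To make the idea of a sparse representation concrete, the sketch below converts a small dense person-by-covariate matrix into a long table that keeps only non-zero values. It uses a plain data frame rather than an ffdf object, and the column names and covariate IDs are illustrative assumptions.

```r
# Dense toy matrix: 3 persons (rows) by 3 covariates (columns).
covariateIds <- c(101, 102, 103)  # placeholder covariate IDs
dense <- matrix(c(1, 0, 0,
                  0, 0, 2.5,
                  1, 1, 0),
                nrow = 3, byrow = TRUE)

# Keep one row per non-zero cell; zero cells are simply omitted.
keep <- dense != 0
sparse <- data.frame(
  rowId = row(dense)[keep],
  covariateId = covariateIds[col(dense)[keep]],
  covariateValue = dense[keep]
)
print(sparse)
```

Of the nine cells in the dense matrix only four are stored, which is where the space saving comes from when most covariates are 0 for most persons.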
An ffdf object describing the covariates that have been extracted.
A list of objects with information on how the cohortMethodData object was constructed.
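To tie the arguments and the return value together, a call might be sketched as follows. The wrapper name `extractCohortMethodData` is hypothetical, and every ID, schema name, table name, and date is a placeholder. The two settings objects are taken as parameters because their construction is not covered here. Since the placeholder IDs are cohort IDs rather than drug concept IDs, excludeDrugsFromCovariates is set to FALSE.

```r
# Sketch only: the call is wrapped in a function, so nothing touches a
# database until the function is invoked with real settings objects.
# Requires the CohortMethod package and a reachable CDM database at call time.
extractCohortMethodData <- function(connectionDetails, covariateSettings) {
  CohortMethod::getDbCohortMethodData(
    connectionDetails = connectionDetails,
    cdmDatabaseSchema = "cdm_instance.dbo",           # placeholder schema
    targetId = 1,                                     # placeholder cohort ID
    comparatorId = 2,                                 # placeholder cohort ID
    outcomeIds = c(3, 4),                             # placeholder outcome IDs
    studyStartDate = "20100101",
    studyEndDate = "20151231",
    exposureDatabaseSchema = "results_instance.dbo",  # placeholder schema
    exposureTable = "my_cohorts",                     # placeholder cohort table
    outcomeDatabaseSchema = "results_instance.dbo",
    outcomeTable = "my_cohorts",
    cdmVersion = "5",
    excludeDrugsFromCovariates = FALSE,               # IDs are cohort IDs here
    washoutPeriod = 180,
    maxCohortSize = 100000,
    covariateSettings = covariateSettings
  )
}
```

Requesting several outcome IDs in one call follows the note above that multiple outcomes can be captured at once for efficiency.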
Based on the arguments, the treatment and comparator cohorts are retrieved, as well as outcomes
occurring in exposed subjects. The treatment and comparator cohorts can be identified using the
drug_era table, or through user-defined cohorts in a cohort table either inside the CDM instance or
in a separate schema. Similarly, outcomes are identified using the condition_era table or through
user-defined cohorts in a cohort table either inside the CDM instance or in a separate schema.
Covariates are automatically extracted from the appropriate tables within the CDM. Important: The
target and comparator drug must not be included in the covariates, including any descendant
concepts. If the targetId and comparatorId arguments represent real concept IDs, you
can set the excludeDrugsFromCovariates argument to TRUE, and the drugs and their
descendants will be excluded from the covariates automatically. However, if the targetId and
comparatorId arguments do not represent concept IDs, you will need to manually add the drugs
and descendants to the
excludedCovariateConceptIds of the
The removeDuplicateSubjects argument can have one of the following values:
Do not remove subjects that appear in both the target and comparator cohorts.
When a subject appears in both the target and comparator cohorts, keep only whichever cohort is first in time.
Remove subjects that appear in both the target and comparator cohorts completely from the analysis.
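The three behaviours listed above can be illustrated on a toy cohort table in base R. This is a sketch, not the package's implementation: the helper `handleDuplicates`, the column names, and the mode labels "keepBoth", "keepEarliest", and "dropBoth" are all hypothetical and are not the argument values accepted by the function.

```r
# Toy data: subject 2 appears in both the target (1) and comparator (0) cohort.
cohorts <- data.frame(
  subjectId = c(1, 2, 2, 3),
  treatment = c(1, 1, 0, 0),
  cohortStartDate = as.Date(c("2012-01-01", "2012-03-01",
                              "2012-06-01", "2012-02-01"))
)

handleDuplicates <- function(cohorts,
                             mode = c("keepBoth", "keepEarliest", "dropBoth")) {
  mode <- match.arg(mode)
  inBoth <- intersect(cohorts$subjectId[cohorts$treatment == 1],
                      cohorts$subjectId[cohorts$treatment == 0])
  if (mode == "keepBoth") {
    return(cohorts)
  }
  if (mode == "dropBoth") {
    return(cohorts[!cohorts$subjectId %in% inBoth, , drop = FALSE])
  }
  # keepEarliest: for subjects in both cohorts, keep only the cohort entered first.
  ordered <- cohorts[order(cohorts$subjectId, cohorts$cohortStartDate), ]
  firstTreatment <- ordered$treatment[match(ordered$subjectId, ordered$subjectId)]
  ordered[!(ordered$subjectId %in% inBoth) |
            ordered$treatment == firstTreatment, , drop = FALSE]
}

keptAll <- handleDuplicates(cohorts, "keepBoth")
keptFirst <- handleDuplicates(cohorts, "keepEarliest")
removedAll <- handleDuplicates(cohorts, "dropBoth")
print(keptFirst)
```

In this toy data, subject 2 entered the target cohort first, so the "keepEarliest" mode retains only that subject's target row, while "dropBoth" removes the subject entirely.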