runSelfControlledCohort generates population-level estimates by comparing exposed and unexposed time within an exposed cohort.

runSelfControlledCohort(
  connectionDetails = NULL,
  cdmDatabaseSchema,
  connection = NULL,
  cdmVersion = 5,
  tempEmulationSchema = getOption("sqlRenderTempEmulationSchema"),
  oracleTempSchema = NULL,
  exposureIds = NULL,
  outcomeIds = NULL,
  exposureDatabaseSchema = cdmDatabaseSchema,
  exposureTable = "drug_era",
  outcomeDatabaseSchema = cdmDatabaseSchema,
  outcomeTable = "condition_era",
  firstExposureOnly = TRUE,
  firstOutcomeOnly = TRUE,
  minAge = "",
  maxAge = "",
  studyStartDate = "",
  studyEndDate = "",
  addLengthOfExposureExposed = TRUE,
  riskWindowStartExposed = 1,
  riskWindowEndExposed = 30,
  addLengthOfExposureUnexposed = TRUE,
  riskWindowEndUnexposed = -1,
  riskWindowStartUnexposed = -30,
  hasFullTimeAtRisk = FALSE,
  washoutPeriod = 0,
  followupPeriod = 0,
  computeTarDistribution = FALSE,
  computeThreads = 1,
  riskWindowsTable = "#risk_windows",
  resultsTable = "#results",
  resultsDatabaseSchema = NULL,
  postProcessFunction = NULL,
  postProcessArgs = list(),
  returnEstimates = TRUE
)

Arguments

connectionDetails

An R object of type connectionDetails created using the function createConnectionDetails in the DatabaseConnector package.

cdmDatabaseSchema

Name of database schema that contains the OMOP CDM and vocabulary.

connection

A DatabaseConnector connection instance.

cdmVersion

Defines the OMOP CDM version used: currently supports "4" and "5".

tempEmulationSchema

Some database platforms like Oracle and Impala do not truly support temp tables. To emulate temp tables, provide a schema with write privileges where temp tables can be created.

oracleTempSchema

For Oracle only: the name of the database schema where you want all temporary tables to be managed. Requires create/insert permissions to this database.

exposureIds

A vector containing the drug_concept_ids or cohort_definition_ids of the exposures of interest. If empty, all exposures in the exposure table will be included.

outcomeIds

The condition_concept_ids or cohort_definition_ids of the outcomes of interest. If empty, all the outcomes in the outcome table will be included.

exposureDatabaseSchema

The name of the database schema that contains the exposure data used to define the exposure cohorts. If exposureTable = DRUG_ERA, exposureDatabaseSchema is not used but is assumed to be cdmDatabaseSchema. Requires read permissions to this database.

exposureTable

The name of the table that contains the exposure cohorts. If exposureTable <> DRUG_ERA, then exposureTable is expected to have the format of the COHORT table: COHORT_DEFINITION_ID, SUBJECT_ID, COHORT_START_DATE, COHORT_END_DATE.

outcomeDatabaseSchema

The name of the database schema that contains the data used to define the outcome cohorts. If outcomeTable = CONDITION_ERA, outcomeDatabaseSchema is not used but is assumed to be cdmDatabaseSchema. Requires read permissions to this database.

outcomeTable

The name of the table that contains the outcome cohorts. If outcomeTable <> CONDITION_OCCURRENCE, then outcomeTable is expected to have the format of the COHORT table: COHORT_DEFINITION_ID, SUBJECT_ID, COHORT_START_DATE, COHORT_END_DATE.

firstExposureOnly

If TRUE, only use the first occurrence of each drug concept id for each person.

firstOutcomeOnly

If TRUE, only use the first occurrence of each condition concept id for each person.

minAge

Integer for minimum allowable age.

maxAge

Integer for maximum allowable age.

studyStartDate

Minimum allowable date for the index exposure. Date format is 'yyyymmdd'.

studyEndDate

Maximum allowable date for the index exposure. Date format is 'yyyymmdd'.

addLengthOfExposureExposed

If TRUE, use the duration from drugEraStart -> drugEraEnd as part of timeAtRisk.

riskWindowStartExposed

Integer of days to add to drugEraStart for start of timeAtRisk (0 to include index date, 1 to start the day after).

riskWindowEndExposed

Additional window to add to end of exposure period (if addLengthOfExposureExposed = TRUE, then add to exposure end date, else add to exposure start date).

addLengthOfExposureUnexposed

If TRUE, use the duration from exposure start -> exposure end as part of timeAtRisk looking back before exposure start.

riskWindowEndUnexposed

Integer of days to add to exposure start for end of timeAtRisk (0 to include index date, -1 to end the day before).

riskWindowStartUnexposed

Additional window to add to start of exposure period (if addLengthOfExposureUnexposed = TRUE, then add to exposure end date, else add to exposure start date).

hasFullTimeAtRisk

If TRUE, restrict to people who have full time-at-risk exposed and unexposed.

washoutPeriod

Integer to define required time observed before exposure start.

followupPeriod

Integer to define required time observed after exposure start.

computeTarDistribution

If TRUE, compute the distribution of time-at-risk and the average absolute time between treatment and outcome. Note that this may add significant computation time on some database engines.

computeThreads

Number of parallel threads for computing IRRs with exact confidence intervals.

riskWindowsTable

String: optionally store the risk windows in a (non-temporary) table.

resultsTable

String: optionally store the summary results (number of exposed/unexposed patients per outcome-exposure pair) in a (non-temporary) table. Note that this table does not store the rate ratios, only the values required to calculate them.

resultsDatabaseSchema

Schema to output results to. Ignored if resultsTable and riskWindowsTable are temporary.

postProcessFunction

Callback function to handle batches of data. Useful for massive result sets that overflow system memory. See example.

postProcessArgs

Arguments for post processing function callback.

returnEstimates

Boolean: set to FALSE to skip returning the estimates. Only useful when postProcessFunction is used.
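The risk-window arguments above can be illustrated with a small date calculation. This is a minimal sketch of one reading of the argument descriptions, not the package's SQL: computeRiskWindows is a hypothetical helper, and the assumption that addLengthOfExposureUnexposed extends the unexposed window backwards by the exposure length is inferred from the description of that argument.

```r
# Hypothetical helper (not part of the package): derive the exposed and
# unexposed time-at-risk windows for a single exposure era.
computeRiskWindows <- function(exposureStart,
                               exposureEnd,
                               addLengthOfExposureExposed = TRUE,
                               riskWindowStartExposed = 1,
                               riskWindowEndExposed = 30,
                               addLengthOfExposureUnexposed = TRUE,
                               riskWindowEndUnexposed = -1,
                               riskWindowStartUnexposed = -30) {
  exposureStart <- as.Date(exposureStart)
  exposureEnd <- as.Date(exposureEnd)
  exposureDays <- as.integer(exposureEnd - exposureStart)

  # Exposed window: offsets are added to the exposure start, and the end
  # offset is added to the exposure end when the exposure length is included.
  exposedStart <- exposureStart + riskWindowStartExposed
  exposedEnd <- if (addLengthOfExposureExposed) {
    exposureEnd + riskWindowEndExposed
  } else {
    exposureStart + riskWindowEndExposed
  }

  # Unexposed window: looks back before the exposure start. Assumption: the
  # exposure length, when included, pushes the window start further back.
  unexposedEnd <- exposureStart + riskWindowEndUnexposed
  unexposedStart <- exposureStart + riskWindowStartUnexposed -
    (if (addLengthOfExposureUnexposed) exposureDays else 0L)

  starts <- c(exposedStart, unexposedStart)
  ends <- c(exposedEnd, unexposedEnd)
  data.frame(window = c("exposed", "unexposed"),
             start = starts,
             end = ends,
             days = as.integer(ends - starts) + 1L)
}

windows <- computeRiskWindows("2021-06-01", "2021-06-10")
print(windows)
```

With the default offsets, both windows in this sketch have the same length, which is the symmetry the default argument values suggest.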

Value

An object of type sccResults containing the results of the analysis.

Details

Population-level estimation method that estimates incidence rate ratios by comparing exposed and unexposed time within an exposed cohort. If multiple exposureIds and outcomeIds are provided, estimates will be generated for every combination of exposure and outcome.
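As a rough illustration of the comparison described here, the incidence rate ratio for one exposure-outcome pair can be computed from outcome counts and time at risk. This is a sketch under stated assumptions: the counts and variable names are made up, and stats::poisson.test stands in for whatever exact-interval routine the package uses internally.

```r
# Hypothetical summary counts for a single exposure-outcome pair.
numOutcomesExposed <- 30
numOutcomesUnexposed <- 20
timeAtRiskExposed <- 5000    # person-days in the exposed windows
timeAtRiskUnexposed <- 5000  # person-days in the unexposed windows

# Incidence rate ratio: exposed rate divided by unexposed rate.
irr <- (numOutcomesExposed / timeAtRiskExposed) /
  (numOutcomesUnexposed / timeAtRiskUnexposed)

# Exact comparison of two Poisson rates from base R; the estimate is the
# rate ratio and conf.int is an exact 95% confidence interval.
test <- stats::poisson.test(x = c(numOutcomesExposed, numOutcomesUnexposed),
                            T = c(timeAtRiskExposed, timeAtRiskUnexposed))

print(irr)
print(test$conf.int)
```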

References

Ryan PB, Schuemie MJ, Madigan D. Empirical performance of a self-controlled cohort method: lessons for developing a risk identification and analysis system. Drug Safety 36 Suppl 1:S95-106, 2013

Examples

if (FALSE) {
# createConnectionDetails is provided by the DatabaseConnector package
connectionDetails <- DatabaseConnector::createConnectionDetails(dbms = "sql server",
                                                                server = "RNDUSRDHIT07.jnj.com")
sccResult <- runSelfControlledCohort(connectionDetails,
                                     cdmDatabaseSchema = "cdm_truven_mdcr.dbo",
                                     exposureIds = c(767410, 1314924, 907879),
                                     outcomeIds = 444382,
                                     outcomeTable = "condition_era")

# Use a callback function that writes each batch to a CSV file instead of keeping the results in memory
csvFileName <- "D:/path/to/output.csv"
writeSccData <- function(data, position, csvFileName) {
  vroom::vroom_write(data, csvFileName, delim = ",", append = position != 1, na = "")
}

runSelfControlledCohort(connectionDetails,
                        cdmDatabaseSchema = "cdm_truven_mdcr.dbo",
                        exposureIds = c(767410, 1314924, 907879),
                        outcomeIds = 444382,
                        outcomeTable = "condition_era",
                        postProcessFunction = writeSccData,
                        postProcessArgs = list(csvFileName = csvFileName),
                        returnEstimates = FALSE)
}
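The callback pattern in the example above can be tried without a database connection. The following is a minimal sketch: the batches, the column names, and the way position advances are assumptions made for illustration (the example only implies that the first batch has position 1), and base R's write.table is swapped in for vroom.

```r
# Callback with the same argument shape as writeSccData in the example above:
# the first batch writes the header, later batches are appended.
writeBatch <- function(data, position, csvFileName) {
  utils::write.table(data,
                     csvFileName,
                     sep = ",",
                     row.names = FALSE,
                     col.names = position == 1,
                     append = position != 1,
                     na = "")
}

csvFileName <- tempfile(fileext = ".csv")
postProcessArgs <- list(csvFileName = csvFileName)

# Hypothetical result batches; the real column names may differ.
batches <- list(
  data.frame(exposureId = c(1, 2), outcomeId = c(10, 10), irr = c(1.2, 0.9)),
  data.frame(exposureId = 3, outcomeId = 10, irr = NA_real_)
)

# Simulate the batch loop: pass each batch, its starting row position,
# and the extra arguments to the callback.
position <- 1
for (batch in batches) {
  do.call(writeBatch, c(list(data = batch, position = position), postProcessArgs))
  position <- position + nrow(batch)
}

written <- read.csv(csvFileName)
print(written)
```

Because each batch is written and then discarded, memory use stays bounded by the batch size rather than by the full result set.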