Synthesize positive controls for reference set

synthesizeReferenceSetPositiveControls(
  connectionDetails,
  oracleTempSchema = NULL,
  tempEmulationSchema = getOption("sqlRenderTempEmulationSchema"),
  cdmDatabaseSchema,
  exposureDatabaseSchema = cdmDatabaseSchema,
  exposureTable = "drug_era",
  outcomeDatabaseSchema = cdmDatabaseSchema,
  outcomeTable = "cohort",
  referenceSet = "ohdsiMethodsBenchmark",
  maxCores = 1,
  workFolder,
  summaryFileName = file.path(workFolder, "allControls.csv")
)

Arguments

connectionDetails

An R object of type ConnectionDetails created using the function createConnectionDetails in the DatabaseConnector package.

oracleTempSchema

DEPRECATED: use `tempEmulationSchema` instead.

tempEmulationSchema

Some database platforms like Oracle and Impala do not truly support temp tables. To emulate temp tables, provide a schema with write privileges where temp tables can be created.

cdmDatabaseSchema

A database schema containing health care data in the OMOP Common Data Model. Note that for SQL Server, both the database and schema should be specified, e.g. 'cdm_schema.dbo'.

exposureDatabaseSchema

The name of the database schema containing the exposure data used to define the exposure cohorts. If exposureTable = DRUG_ERA, exposureDatabaseSchema is not used and is assumed to be cdmDatabaseSchema. Read permissions on this schema are required.

exposureTable

The name of the table that contains the exposure cohorts. If exposureTable <> DRUG_ERA, the table is expected to have the format of a COHORT table: COHORT_DEFINITION_ID, SUBJECT_ID, COHORT_START_DATE, COHORT_END_DATE.

outcomeDatabaseSchema

The database schema where the target outcome table is located. Note that for SQL Server, both the database and schema should be specified, e.g. 'cdm_schema.dbo'.

outcomeTable

The name of the table where the outcomes will be stored.

referenceSet

The name of the reference set for which positive controls need to be synthesized. Currently supported are "ohdsiMethodsBenchmark" and "ohdsiDevelopment".

maxCores

The number of cores to use in parallel. Making more cores available can speed up the analyses.

workFolder

Name of the local folder in which to place intermediate results; make sure to use forward slashes (/). Do not use a folder on a network drive, since this greatly degrades performance.

summaryFileName

The name of the CSV file in which to store the summary of the final set of positive and negative controls.
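
The local, non-database arguments described above might be prepared as in the following sketch. It uses base R only. The schema name "scratch_schema" and the folder name "methodEvaluation" are hypothetical placeholders, and creating the folder up front is a precaution rather than a documented requirement of the function.

```r
# Set the option that tempEmulationSchema reads by default.
# "scratch_schema" is a hypothetical name; use a schema with write privileges.
options(sqlRenderTempEmulationSchema = "scratch_schema")

# Build a work folder path on a local drive, normalized to forward slashes.
# tempdir() is used here only so the sketch runs anywhere.
workFolder <- normalizePath(
  file.path(tempdir(), "methodEvaluation"),
  winslash = "/",
  mustWork = FALSE
)

# Creating the folder beforehand is harmless if it already exists.
dir.create(workFolder, recursive = TRUE, showWarnings = FALSE)

# Mirrors the default value of the summaryFileName argument.
summaryFileName <- file.path(workFolder, "allControls.csv")
```

On Oracle or Impala the option only takes effect if it is set before the function is called, because the default for tempEmulationSchema is read from it at call time.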

Details

This function will synthesize positive controls for a given reference set based on the real negative controls. Data from the database will be used to fit outcome models for each negative control outcome, and these models will be used to sample additional synthetic outcomes during exposure to increase the true hazard ratio. The positive control outcome cohorts will be stored in the same database table as the negative control outcome cohorts. A summary file will be created listing all positive and negative controls. This list should then be used as input for the method under evaluation.
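
A call following the workflow described above could look like the sketch below. It assumes the function is exported by the MethodEvaluation package and that a ConnectionDetails object has already been created with DatabaseConnector. The wrapper name runSynthesis, the schema "scratch.dbo", and the table name "outcomes" are hypothetical. The columns of the summary file are not documented here, so the sketch only reads the file and returns it.

```r
# Hypothetical wrapper around the documented function. Defining it does not
# touch the database; calling it requires a reachable CDM database.
runSynthesis <- function(connectionDetails, cdmDatabaseSchema, workFolder) {
  summaryFileName <- file.path(workFolder, "allControls.csv")

  MethodEvaluation::synthesizeReferenceSetPositiveControls(
    connectionDetails = connectionDetails,
    cdmDatabaseSchema = cdmDatabaseSchema,
    exposureTable = "drug_era",            # default: exposures from DRUG_ERA
    outcomeDatabaseSchema = "scratch.dbo", # assumption: a writable schema
    outcomeTable = "outcomes",             # assumption: holds the negative controls
    referenceSet = "ohdsiMethodsBenchmark",
    maxCores = 1,
    workFolder = workFolder,
    summaryFileName = summaryFileName
  )

  # The summary lists all positive and negative controls and serves as
  # input for the method under evaluation.
  read.csv(summaryFileName, stringsAsFactors = FALSE)
}
```

Keeping the outcome schema and table the same as those holding the negative control outcome cohorts matches the description above, since the synthesized positive control cohorts are written to that same table.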