R/CreateReferenceSetCohorts.R
synthesizeReferenceSetPositiveControls.Rd
Synthesize positive controls for reference set
synthesizeReferenceSetPositiveControls(
connectionDetails,
oracleTempSchema = NULL,
tempEmulationSchema = getOption("sqlRenderTempEmulationSchema"),
cdmDatabaseSchema,
exposureDatabaseSchema = cdmDatabaseSchema,
exposureTable = "drug_era",
outcomeDatabaseSchema = cdmDatabaseSchema,
outcomeTable = "cohort",
referenceSet = "ohdsiMethodsBenchmark",
maxCores = 1,
workFolder,
summaryFileName = file.path(workFolder, "allControls.csv")
)
An R object of type ConnectionDetails created using the function createConnectionDetails in the DatabaseConnector package.
DEPRECATED: use `tempEmulationSchema` instead.
Some database platforms like Oracle and Impala do not truly support temp tables. To emulate temp tables, provide a schema with write privileges where temp tables can be created.
A database schema containing health care data in the OMOP Common Data Model. Note that for SQL Server, both the database and schema should be specified, e.g. 'cdm_schema.dbo'.
The name of the database schema where the exposure data used to define the exposure cohorts is located. If exposureTable = DRUG_ERA, exposureDatabaseSchema is not used and is assumed to be cdmDatabaseSchema. Requires read permissions to this database.
The name of the table that contains the exposure cohorts. If exposureTable <> DRUG_ERA, the table is expected to have the format of the COHORT table: COHORT_DEFINITION_ID, SUBJECT_ID, COHORT_START_DATE, COHORT_END_DATE.
The database schema where the target outcome table is located. Note that for SQL Server, both the database and schema should be specified, e.g. 'cdm_schema.dbo'.
The name of the table where the outcomes will be stored.
The name of the reference set for which positive controls need to be synthesized. Currently supported are "ohdsiMethodsBenchmark" and "ohdsiDevelopment".
The number of parallel cores to use. Making more cores available can speed up the analyses.
The name of the local folder in which to place intermediary results; make sure to use forward slashes (/). Do not use a folder on a network drive, since this greatly degrades performance.
The name of the CSV file in which to store the summary of the final set of positive and negative controls.
This function will synthesize positive controls for a given reference set based on the real negative controls. Data from the database will be used to fit outcome models for each negative control outcome, and these models will be used to sample additional synthetic outcomes during exposure to increase the true hazard ratio. The positive control outcome cohorts will be stored in the same database table as the negative control outcome cohorts. A summary file will be created listing all positive and negative controls. This list should then be used as input for the method under evaluation.
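The call described above can be sketched as a small wrapper. This is a minimal illustration, not part of the package. The wrapper name runPositiveControlSynthesis and all values in the commented example call are hypothetical. The sketch assumes the function is exported by the MethodEvaluation package, and that the negative control outcome cohorts already exist in the outcome table.

```r
# Hypothetical wrapper around synthesizeReferenceSetPositiveControls().
# The wrapper name and its defaults are illustrative assumptions only.
runPositiveControlSynthesis <- function(connectionDetails,
                                        cdmDatabaseSchema,
                                        outcomeDatabaseSchema,
                                        outcomeTable,
                                        workFolder,
                                        maxCores = 1) {
  # Intermediary results go to a local folder; create it if it is missing.
  if (!dir.exists(workFolder)) {
    dir.create(workFolder, recursive = TRUE)
  }
  summaryFileName <- file.path(workFolder, "allControls.csv")

  # Assumption: the function lives in the MethodEvaluation package.
  MethodEvaluation::synthesizeReferenceSetPositiveControls(
    connectionDetails = connectionDetails,
    cdmDatabaseSchema = cdmDatabaseSchema,
    exposureDatabaseSchema = cdmDatabaseSchema,
    exposureTable = "drug_era",
    outcomeDatabaseSchema = outcomeDatabaseSchema,
    outcomeTable = outcomeTable,
    referenceSet = "ohdsiMethodsBenchmark",
    maxCores = maxCores,
    workFolder = workFolder,
    summaryFileName = summaryFileName
  )

  # Return the location of the summary of positive and negative controls.
  invisible(summaryFileName)
}

# Example call (not run; every value below is a placeholder):
# connectionDetails <- DatabaseConnector::createConnectionDetails(
#   dbms = "postgresql", server = "localhost/ohdsi",
#   user = "joe", password = "secret"
# )
# summaryFile <- runPositiveControlSynthesis(
#   connectionDetails,
#   cdmDatabaseSchema = "cdm",
#   outcomeDatabaseSchema = "scratch",
#   outcomeTable = "outcome_cohorts",
#   workFolder = "c:/temp/methodEvaluation",
#   maxCores = 4
# )
# allControls <- read.csv(summaryFile)
```

Defining the wrapper does not touch the database; only the call does. The returned path points at the summary file, which can then be read and passed to the method under evaluation.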