Synthesize positive controls
synthesizePositiveControls(
connectionDetails,
cdmDatabaseSchema,
oracleTempSchema = NULL,
tempEmulationSchema = getOption("sqlRenderTempEmulationSchema"),
exposureDatabaseSchema = cdmDatabaseSchema,
exposureTable = "drug_era",
outcomeDatabaseSchema = cdmDatabaseSchema,
outcomeTable = "cohort",
outputDatabaseSchema = outcomeDatabaseSchema,
outputTable = outcomeTable,
createOutputTable = FALSE,
exposureOutcomePairs,
modelType = "poisson",
minOutcomeCountForModel = 100,
minOutcomeCountForInjection = 25,
minModelCount = 5,
covariateSettings = FeatureExtraction::createCovariateSettings(useDemographicsAgeGroup
= TRUE, useDemographicsGender = TRUE, useDemographicsIndexYear = TRUE,
useDemographicsIndexMonth = TRUE, useConditionGroupEraLongTerm = TRUE,
useDrugGroupEraLongTerm = TRUE, useProcedureOccurrenceLongTerm = TRUE,
useMeasurementLongTerm = TRUE, useObservationLongTerm = TRUE, useCharlsonIndex =
TRUE, useDcsi = TRUE, useChads2Vasc = TRUE, longTermStartDays = 365, endDays = 0),
prior = Cyclops::createPrior("laplace", exclude = 0, useCrossValidation = TRUE),
control = Cyclops::createControl(cvType = "auto", startingVariance = 0.1, seed = 1,
resetCoefficients = TRUE, noiseLevel = "quiet", threads = 10),
firstExposureOnly = FALSE,
washoutPeriod = 183,
riskWindowStart = 0,
riskWindowEnd = 0,
endAnchor = "cohort end",
addIntentToTreat = FALSE,
firstOutcomeOnly = FALSE,
removePeopleWithPriorOutcomes = FALSE,
maxSubjectsForModel = 1e+05,
effectSizes = c(1, 1.25, 1.5, 2, 4),
precision = 0.01,
outputIdOffset = 1000,
workFolder = "./SignalInjectionTemp",
cdmVersion = "5",
modelThreads = 1,
generationThreads = 1
)
An R object of type ConnectionDetails created using the function createConnectionDetails in the DatabaseConnector package.
Name of database schema that contains OMOP CDM and vocabulary.
DEPRECATED: use `tempEmulationSchema` instead.
Some database platforms like Oracle and Impala do not truly support temp tables. To emulate temp tables, provide a schema with write privileges where temp tables can be created.
The name of the database schema where the exposure data used to define the exposure cohorts is available. If exposureTable = DRUG_ERA, exposureDatabaseSchema is not used but assumed to be cdmDatabaseSchema. Requires read permissions to this database.
The table name that contains the exposure cohorts. If exposureTable <> DRUG_ERA, the table is expected to have the format of the COHORT table: COHORT_CONCEPT_ID, SUBJECT_ID, COHORT_START_DATE, COHORT_END_DATE.
The name of the database schema where the data used to define the outcome cohorts is available. If outcomeTable = CONDITION_ERA, outcomeDatabaseSchema is not used but assumed to be cdmDatabaseSchema. Requires read permissions to this database.
The table name that contains the outcome cohorts. When the table name is not CONDITION_ERA, this table is expected to have the same format as the COHORT table: SUBJECT_ID, COHORT_START_DATE, COHORT_END_DATE, COHORT_CONCEPT_ID (CDM v4) or COHORT_DEFINITION_ID (CDM v5 and higher).
The name of the database schema where the tables containing the new outcomes will be located. Requires write permissions to this database.
The name of the table that will contain the generated outcome cohorts.
Should the output table be created prior to inserting the outcomes? If TRUE and the table already exists, it will first be deleted. If FALSE, the table is assumed to exist and the outcomes will be inserted. Any existing outcomes with the same IDs will first be deleted.
A data frame with at least two columns:
"exposureId" containing the drug_concept_ID or cohort_concept_id of the exposure variable
"outcomeId" containing the condition_concept_ID or cohort_concept_id of the outcome variable
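As a minimal sketch of the expected input, the data frame of negative-control pairs could be built as follows. The concept IDs are made-up placeholders, not real OMOP concepts, and the variable name `exposureOutcomePairs` is chosen only to match the argument name.

```r
# Hypothetical negative-control pairs: every exposure crossed with every outcome.
# The IDs below are placeholders for illustration only.
exposureOutcomePairs <- expand.grid(
  exposureId = c(1001, 1002),
  outcomeId = c(2001, 2002, 2003)
)

# The function requires at least these two columns.
stopifnot(all(c("exposureId", "outcomeId") %in% names(exposureOutcomePairs)))
```

Additional columns are allowed, since only these two are required.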
Can be either "poisson" or "survival"
Minimum number of outcome events required to build a model.
Minimum number of outcome events required to inject a signal.
Minimum number of negative controls having enough outcomes to fit an outcome model.
An object of type covariateSettings as created using the createCovariateSettings function in the FeatureExtraction package.
The prior used to fit the outcome model. See createPrior for details.
The control object used to control the cross-validation used to determine the hyperparameters of the prior (if applicable). See createControl for details.
Should signals be injected only for the first exposure? (i.e., assuming an acute effect)
Number of days at the start of observation for which no signals will be injected, but will be used to determine whether exposure or outcome is the first one, and for extracting covariates to build the outcome model.
The start of the risk window relative to the start of the exposure (in days). When 0, risk is assumed to start on the first day of exposure.
The end of the risk window (in days) relative to the endAnchor.
The anchor point for the end of the risk window. Can be "cohort start" or "cohort end".
If TRUE, the signal will be injected not only in the primary time at risk, but also after the time at risk (up until the observation period end). In both time periods, the target effect size will be enforced. This allows the same positive control synthesis to be used in both on-treatment and intent-to-treat analysis variants. However, this will preclude the controls from being used in self-controlled designs that consider the time after exposure. Requires firstExposureOnly = TRUE.
Should only the first outcome per person be considered when modeling the outcome?
Remove people with prior outcomes?
Maximum number of people used to fit an outcome model.
A numeric vector of effect sizes that should be inserted.
The allowed ratio between target and injected signal size.
What should be the first new outcome ID that is to be created?
Path to a folder where intermediate data will be stored.
Define the OMOP CDM version used: currently supports "4" and "5".
Number of parallel threads to use when fitting outcome models.
Number of parallel threads to use when generating outcomes.
A data.frame listing all the drug-outcome pairs in combination with the requested effect sizes and the real inserted effect size (which might differ from the requested effect size because of sampling error).
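A call might be wrapped as in the sketch below. The wrapper name `runSynthesis`, the schema names "cdm" and "scratch", and the table name "synthetic_outcomes" are assumptions for illustration; only argument names from the usage above are passed, and all other arguments keep their defaults. Nothing touches a database until the wrapper is actually called.

```r
# Hypothetical convenience wrapper around synthesizePositiveControls().
# Schema and table names are placeholders; adjust them to your environment.
runSynthesis <- function(connectionDetails, exposureOutcomePairs) {
  MethodEvaluation::synthesizePositiveControls(
    connectionDetails = connectionDetails,
    cdmDatabaseSchema = "cdm",
    outputDatabaseSchema = "scratch",    # needs write permissions
    outputTable = "synthetic_outcomes",
    createOutputTable = TRUE,            # drops the table first if it exists
    exposureOutcomePairs = exposureOutcomePairs,
    modelType = "survival",
    firstExposureOnly = TRUE,
    effectSizes = c(1.5, 2, 4),
    workFolder = "./SignalInjectionTemp"
  )
}

# Example use (not run here; requires a live CDM database):
# connectionDetails <- DatabaseConnector::createConnectionDetails(
#   dbms = "postgresql", server = "localhost/ohdsi",
#   user = "joe", password = "secret"
# )
# injectionSummary <- runSynthesis(connectionDetails, exposureOutcomePairs)
```

Writing to a dedicated output table, rather than the default of reusing outcomeTable, keeps the synthesized outcomes separate from the original cohorts.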
This function will insert additional outcomes for a given set of drug-outcome pairs. It is assumed
that these drug-outcome pairs represent negative controls, so the true relative risk before
inserting any outcomes should be 1. There are two models for inserting the outcomes during the
specified risk window of the drug: a Poisson model assuming multiple outcomes could occur during a
single exposure, and a survival model considering only one outcome per exposure.
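The idea behind Poisson-style injection can be illustrated with a toy simulation. This is not the package's implementation: the constant baseline rate, the sample size, and all variable names are assumptions made for the sketch. It only shows why the realized effect size can deviate slightly from the target because of sampling error.

```r
# Toy illustration only; the package fits an outcome model to obtain
# per-exposure baseline risks, which is replaced here by a constant.
set.seed(42)
nExposures <- 100000
targetRr <- 2

# Assumed expected number of outcomes per exposure under the null (RR = 1).
baselineExpected <- rep(0.05, nExposures)

# Outcomes already present in the data for these negative controls.
observed <- rpois(nExposures, baselineExpected)

# Extra outcomes needed to raise the expected rate from 1x to targetRr x.
injected <- rpois(nExposures, (targetRr - 1) * baselineExpected)

# Realized relative risk after injection; close to, but not exactly, the target.
realizedRr <- (sum(observed) + sum(injected)) / sum(observed)
```

Under a survival-style model one would instead inject at most one outcome per exposure, which this sketch does not cover.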
It is possible to use bulk import to insert the generated outcomes in the database. This requires
the environment variable 'USE_MPP_BULK_LOAD' to be set to 'TRUE'. See ?DatabaseConnector::insertTable
for details on how to configure the bulk upload.
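For instance, the variable can be set from within the R session before calling the function. This is a minimal sketch; any additional platform-specific settings described in ?DatabaseConnector::insertTable still need to be configured separately.

```r
# Enable bulk loading for the current R session.
Sys.setenv(USE_MPP_BULK_LOAD = "TRUE")

# Confirm the setting is visible to code running in this session.
bulkLoadEnabled <- identical(Sys.getenv("USE_MPP_BULK_LOAD"), "TRUE")
```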
Schuemie MJ, Hripcsak G, Ryan PB, Madigan D, Suchard MA. Empirical confidence interval calibration for population-level effect estimation studies in observational healthcare data. Proc Natl Acad Sci U S A. 2018 Mar 13;115(11):2571-2577.