Create the evaluation cohort

createEvaluationCohort(
  connectionDetails,
  oracleTempSchema = NULL,
  tempEmulationSchema = getOption("sqlRenderTempEmulationSchema"),
  phenotype,
  analysisName,
  runDateTime,
  databaseId,
  xSpecCohortId,
  daysFromxSpec = 0,
  xSensCohortId,
  prevalenceCohortId,
  caseCohortId,
  caseFirstOccurrenceOnly,
  xSpecCohortSize = 5000,
  cdmDatabaseSchema,
  cohortDatabaseSchema,
  cohortTable,
  workDatabaseSchema,
  covariateSettings = createDefaultCovariateSettings(excludedCovariateConceptIds = c(),
    addDescendantsToExclude = TRUE),
  modelPopulationCohortId = 0,
  modelPopulationCohortIdStartDay = 0,
  modelPopulationCohortIdEndDay = 0,
  inclusionEvaluationCohortId = 0,
  inclusionEvaluationDaysFromStart = 0,
  inclusionEvaluationDaysFromEnd = 0,
  duringInclusionEvaluationOnly = FALSE,
  exclusionEvaluationCohortId = 0,
  exclusionEvaluationDaysFromStart = 0,
  exclusionEvaluationDaysFromEnd = 0,
  priorModelToUse = NULL,
  minimumOffsetFromStart = 365,
  minimumOffsetFromEnd = 365,
  modelBaseSampleSize = 25000,
  baseSampleSize = 2e+06,
  lowerAgeLimit = 0,
  upperAgeLimit = 120,
  visitLength = 0,
  visitType = c(9201, 9202, 9203, 262, 581477),
  gender = c(8507, 8532),
  race = 0,
  ethnicity = 0,
  startDate = "19001010",
  endDate = "21000101",
  falsePositiveNegativeSubjects = 10,
  cdmVersion = "5",
  outFolder = getwd(),
  exportFolder,
  modelId = "main",
  evaluationCohortId = "main",
  excludeModelFromEvaluation = FALSE,
  randomVisitTable = "",
  removeSubjectsWithFutureDates = TRUE,
  saveEvaluationCohortPlpData = FALSE
)

Arguments

connectionDetails: connectionDetails created using the function createConnectionDetails in the DatabaseConnector package.
oracleTempSchema: DEPRECATED: use tempEmulationSchema instead.
tempEmulationSchema: Some database platforms like Oracle and Impala do not truly support temp tables. To emulate temp tables, provide a schema with write privileges where temp tables can be created.
phenotype: Name of the phenotype for analysis
analysisName: Name of the analysis
runDateTime: Starting date and time of the PheValuator run
databaseId: Name of the database in the analysis
xSpecCohortId: The number of the "extremely specific (xSpec)" cohort definition id in the cohort table (for noisy positives).
daysFromxSpec: Number of days allowed from xSpec condition until analyzed visit
xSensCohortId: The number of the "extremely sensitive (xSens)" cohort definition id in the cohort table (for noisy negatives).
prevalenceCohortId: The number of the cohort definition id to determine the disease prevalence.
caseCohortId: The number of the cohort definition id to determine cases in the evaluation cohort
caseFirstOccurrenceOnly: Set to TRUE if only the first occurrence per subject in the case cohort is to be used
xSpecCohortSize: The recommended xSpec sample size to use in the model (default = 5000)
cdmDatabaseSchema: The name of the database schema that contains the OMOP CDM instance. Requires read permissions to this database. On SQL Server, this should specify both the database and the schema, for example 'cdm_instance.dbo'.
cohortDatabaseSchema: The name of the database schema that is the location where the cohort data used to define the at risk cohort is available. Requires read permissions to this database.
cohortTable: The name of the table that contains the at risk cohort. The table is expected to have the format of the COHORT table: COHORT_DEFINITION_ID, SUBJECT_ID, COHORT_START_DATE, COHORT_END_DATE.
workDatabaseSchema: The name of the database schema that is the location where a table can be created and afterwards removed. Requires write permissions to this database.
covariateSettings: A covariateSettings object as generated using createCovariateSettings(). The default is generated using createDefaultCovariateSettings().
modelPopulationCohortId: The number of the cohort to be used as a base population for the model. If set to 0, the entire database population will be used.
modelPopulationCohortIdStartDay: The number of days relative to the modelPopulationCohortId cohort start date to begin including visits.
modelPopulationCohortIdEndDay: The number of days relative to the modelPopulationCohortId cohort start date to end including visits.
inclusionEvaluationCohortId: The number of the cohort of the population to be used to designate which visits are eligible to be in the evaluation cohort
inclusionEvaluationDaysFromStart: The number of days from the cohort start date of the inclusionEvaluationCohortId to start eligible included visits
inclusionEvaluationDaysFromEnd: The number of days from the cohort start date of the inclusionEvaluationCohortId to end eligible included visits
duringInclusionEvaluationOnly: Only include visits that are within the cohort start and end dates
exclusionEvaluationCohortId: The number of the cohort of the population to be used to designate which visits are NOT eligible to be in the evaluation cohort
exclusionEvaluationDaysFromStart: The number of days from the cohort start date of the exclusionEvaluationCohortId to start ineligible included visits
exclusionEvaluationDaysFromEnd: The number of days from the cohort start date of the exclusionEvaluationCohortId to end ineligible included visits
priorModelToUse: Folder containing a previously developed model to use in the analysis. If NULL (the default), no prior model is used.
minimumOffsetFromStart: Minimum number of days to offset for the analysis visit from the start of the observation period
minimumOffsetFromEnd: Minimum number of days to offset for the analysis visit from the end of the observation period
modelBaseSampleSize: The number of non-xSpec subjects to include in the model
baseSampleSize: The maximum number of subjects in the evaluation cohort.
lowerAgeLimit: The lower age for subjects in the model.
upperAgeLimit: The upper age for subjects in the model.
visitLength: The minimum length of index visit for acute outcomes.
visitType: The concept_id for the visit type.
gender: The gender(s) to be included.
race: The race(s) to be included.
ethnicity: The ethnicity(ies) to be included.
startDate: The starting date for including subjects in the model.
endDate: The ending date for including subjects in the model.
falsePositiveNegativeSubjects: Number of subjects to include for evaluating false positives and negatives
cdmVersion: The CDM version of the database.
outFolder: The folder where the output files will be written.
exportFolder: The folder where the csv output files will be written.
modelId: A string used to generate the file names for this model.
evaluationCohortId: A string used to generate the file names for this evaluation cohort.
excludeModelFromEvaluation: Should subjects used in the model be excluded from the evaluation cohort?
randomVisitTable: Table stored in work directory with pre-selected random visits in format of visit_occurrence table
removeSubjectsWithFutureDates: Should subjects with dates in the future be ignored? Intended for data that erroneously contain future dates.
saveEvaluationCohortPlpData: Should the large PLP file for the evaluation cohort be saved? To be used for debugging purposes.

Details

Fits a diagnostic prediction model, and uses it to create an evaluation cohort with probabilities for the health outcome of interest.
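
Examples

The call below is a minimal sketch, not a tested recipe. It supplies only the arguments that have no default in the signature above and leaves the rest at their defaults. All cohort IDs, schema names, table names, and connection settings are hypothetical placeholders, and the runDateTime format is an assumption because this page does not specify one. The database call is guarded by a flag so the snippet can be sourced without a live connection.

```r
# All values below are hypothetical placeholders; replace them with your own.
runAgainstDatabase <- FALSE  # set to TRUE only after supplying real settings

# Arguments without a default in the createEvaluationCohort() signature
# (connectionDetails is added further down, once a connection is configured).
cohortArgs <- list(
  phenotype = "examplePhenotype",
  analysisName = "exampleAnalysis",
  runDateTime = format(Sys.time(), "%Y-%m-%d %H:%M:%S"),  # format is an assumption
  databaseId = "exampleDb",
  xSpecCohortId = 1001,        # hypothetical cohort definition ids
  xSensCohortId = 1002,
  prevalenceCohortId = 1003,
  caseCohortId = 1004,
  caseFirstOccurrenceOnly = TRUE,
  cdmDatabaseSchema = "cdm_instance.dbo",
  cohortDatabaseSchema = "scratch.dbo",
  cohortTable = "example_cohorts",
  workDatabaseSchema = "scratch.dbo",
  exportFolder = file.path(tempdir(), "export")
)

if (runAgainstDatabase) {
  # Requires the DatabaseConnector and PheValuator packages and a reachable CDM.
  cohortArgs$connectionDetails <- DatabaseConnector::createConnectionDetails(
    dbms = "postgresql",
    server = "localhost/ohdsi",
    user = "user",
    password = "secret"
  )
  do.call(PheValuator::createEvaluationCohort, cohortArgs)
}
```

Collecting the arguments in a named list and passing them through do.call() keeps the required settings in one place, which can help when the same cohort definitions are reused across several databases.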