Create the evaluation cohort

createEvaluationCohort(
  connectionDetails,
  oracleTempSchema = NULL,
  tempEmulationSchema = getOption("sqlRenderTempEmulationSchema"),
  phenotype,
  analysisName,
  runDateTime,
  databaseId,
  xSpecCohortId,
  daysFromxSpec = 0,
  xSensCohortId,
  prevalenceCohortId,
  caseCohortId,
  caseFirstOccurrenceOnly,
  xSpecCohortSize = 5000,
  cdmDatabaseSchema,
  cohortDatabaseSchema,
  cohortTable,
  workDatabaseSchema,
  covariateSettings = createDefaultCovariateSettings(excludedCovariateConceptIds = c(),
    addDescendantsToExclude = TRUE),
  modelPopulationCohortId = 0,
  modelPopulationCohortIdStartDay = 0,
  modelPopulationCohortIdEndDay = 0,
  inclusionEvaluationCohortId = 0,
  inclusionEvaluationDaysFromStart = 0,
  inclusionEvaluationDaysFromEnd = 0,
  duringInclusionEvaluationOnly = FALSE,
  exclusionEvaluationCohortId = 0,
  exclusionEvaluationDaysFromStart = 0,
  exclusionEvaluationDaysFromEnd = 0,
  priorModelToUse = NULL,
  minimumOffsetFromStart = 365,
  minimumOffsetFromEnd = 365,
  modelBaseSampleSize = 25000,
  baseSampleSize = 2e+06,
  lowerAgeLimit = 0,
  upperAgeLimit = 120,
  visitLength = 0,
  visitType = c(9201, 9202, 9203, 262, 581477),
  gender = c(8507, 8532),
  race = 0,
  ethnicity = 0,
  startDate = "19001010",
  endDate = "21000101",
  falsePositiveNegativeSubjects = 10,
  cdmVersion = "5",
  outFolder = getwd(),
  exportFolder,
  modelId = "main",
  evaluationCohortId = "main",
  excludeModelFromEvaluation = FALSE,
  randomVisitTable = "",
  removeSubjectsWithFutureDates = TRUE,
  saveEvaluationCohortPlpData = FALSE
)

Arguments

connectionDetails

connectionDetails created using the function createConnectionDetails in the DatabaseConnector package.

oracleTempSchema

DEPRECATED: use tempEmulationSchema instead.

tempEmulationSchema

Some database platforms like Oracle and Impala do not truly support temp tables. To emulate temp tables, provide a schema with write privileges where temp tables can be created.
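As a minimal sketch: the default in the usage above reads the sqlRenderTempEmulationSchema option, so the schema can be set once per R session instead of being passed on every call. The schema name below is a placeholder, not a value taken from the package.

```r
# Hypothetical schema name; substitute a schema you have write access to.
options(sqlRenderTempEmulationSchema = "scratch.temp_emulation")

# The default argument in the usage above evaluates to this value.
tempEmulationSchema <- getOption("sqlRenderTempEmulationSchema")
print(tempEmulationSchema)
```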

phenotype

Name of the phenotype for analysis

analysisName

Name of the analysis

runDateTime

Starting date and time of the PheValuator run

databaseId

Name of the database in the analysis

xSpecCohortId

The cohort definition id of the "extremely specific (xSpec)" cohort in the cohort table (used for noisy positives).

daysFromxSpec

Number of days allowed from xSpec condition until analyzed visit

xSensCohortId

The cohort definition id of the "extremely sensitive (xSens)" cohort in the cohort table (used for noisy negatives).

prevalenceCohortId

The cohort definition id of the cohort used to determine the disease prevalence.

caseCohortId

The cohort definition id of the cohort used to determine cases in the evaluation cohort.

caseFirstOccurrenceOnly

Set to TRUE if only the first occurrence per subject in the case cohort is to be used.

xSpecCohortSize

The recommended xSpec sample size to use in the model (default = 5000).

cdmDatabaseSchema

The name of the database schema that contains the OMOP CDM instance. Requires read permissions to this database. On SQL Server, this should specify both the database and the schema, for example 'cdm_instance.dbo'.

cohortDatabaseSchema

The name of the database schema that is the location where the cohort data used to define the at risk cohort is available. Requires read permissions to this database.

cohortTable

The name of the table that contains the at risk cohort. cohortTable is expected to have the format of the COHORT table: COHORT_DEFINITION_ID, SUBJECT_ID, COHORT_START_DATE, COHORT_END_DATE.

workDatabaseSchema

The name of the database schema that is the location where a table can be created and afterwards removed. Requires write permissions to this database.

covariateSettings

A covariateSettings object as generated using createCovariateSettings(). The default is built with createDefaultCovariateSettings().

modelPopulationCohortId

The number of the cohort to be used as a base population for the model. If set to 0, the entire database population will be used.

modelPopulationCohortIdStartDay

The number of days relative to the modelPopulationCohortId cohort start date to begin including visits.

modelPopulationCohortIdEndDay

The number of days relative to the modelPopulationCohortId cohort start date to end including visits.

inclusionEvaluationCohortId

The number of the cohort of the population to be used to designate which visits are eligible to be in the evaluation cohort

inclusionEvaluationDaysFromStart

The number of days from the cohort start date of the inclusionEvaluationCohortId to start eligible included visits

inclusionEvaluationDaysFromEnd

The number of days from the cohort start date of the inclusionEvaluationCohortId to end eligible included visits

duringInclusionEvaluationOnly

Only include visits that are within the cohort start and end dates

exclusionEvaluationCohortId

The number of the cohort of the population to be used to designate which visits are NOT eligible to be in the evaluation cohort

exclusionEvaluationDaysFromStart

The number of days from the cohort start date of the exclusionEvaluationCohortId to start ineligible included visits

exclusionEvaluationDaysFromEnd

The number of days from the cohort start date of the exclusionEvaluationCohortId to end ineligible included visits

priorModelToUse

The folder containing a previously developed model to use in the analysis.

minimumOffsetFromStart

Minimum number of days to offset for the analysis visit from the start of the observation period

minimumOffsetFromEnd

Minimum number of days to offset for the analysis visit from the end of the observation period
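One way to read the two offsets is as a date window inside each observation period. The sketch below is illustrative only and is not the package's internal SQL; the helper isVisitEligible and the inclusive boundaries are assumptions.

```r
# Hypothetical helper: TRUE when a visit falls far enough from both ends
# of the observation period. Boundary handling (inclusive) is assumed.
isVisitEligible <- function(visitDate, obsStart, obsEnd,
                            minimumOffsetFromStart = 365,
                            minimumOffsetFromEnd = 365) {
  visitDate >= obsStart + minimumOffsetFromStart &
    visitDate <= obsEnd - minimumOffsetFromEnd
}

obsStart <- as.Date("2015-01-01")
obsEnd <- as.Date("2020-01-01")
visits <- as.Date(c("2015-06-01", "2017-03-15", "2019-09-01"))

# Only the middle visit is at least 365 days from both ends.
eligible <- isVisitEligible(visits, obsStart, obsEnd)
print(eligible)
```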

modelBaseSampleSize

The number of non-xSpec subjects to include in the model

baseSampleSize

The maximum number of subjects in the evaluation cohort.

lowerAgeLimit

The lower age for subjects in the model.

upperAgeLimit

The upper age for subjects in the model.

visitLength

The minimum length of index visit for acute outcomes.

visitType

The concept_id(s) for the visit type(s) to be included.

gender

The gender(s) to be included.
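For reference, the default visitType and gender values are standard OMOP concept ids. The sketch below labels them with their usual vocabulary meanings; the labels are worth checking against your local vocabulary tables, and whether the function accepts named vectors is not stated here, so the names are stripped before use.

```r
# Labels are the commonly used OMOP concept names for the default ids.
visitType <- c(
  inpatient = 9201,
  outpatient = 9202,
  emergencyRoom = 9203,
  emergencyRoomAndInpatient = 262,
  officeVisit = 581477
)
gender <- c(male = 8507, female = 8532)

# Example: restrict the analysis to inpatient-type visits.
inpatientOnly <- unname(visitType[c("inpatient", "emergencyRoomAndInpatient")])
print(inpatientOnly)
```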

race

The race(s) to be included.

ethnicity

The ethnicity(s) to be included.

startDate

The starting date for including subjects in the model.

endDate

The ending date for including subjects in the model.
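The defaults suggest that startDate and endDate are passed as strings in YYYYMMDD form; that format is inferred from the usage, not stated in the argument descriptions. A small sanity check before the call might look like the following, where parseYyyymmdd is a hypothetical helper.

```r
# Hypothetical helper: parse a YYYYMMDD string or fail loudly.
parseYyyymmdd <- function(x) {
  d <- as.Date(x, format = "%Y%m%d")
  if (is.na(d)) stop("Expected a date string in YYYYMMDD format: ", x)
  d
}

startDate <- "20100101"
endDate <- "20191231"

# Guard against a reversed date range before starting a long run.
stopifnot(parseYyyymmdd(startDate) <= parseYyyymmdd(endDate))
```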

falsePositiveNegativeSubjects

Number of subjects to include for evaluating false positives and negatives

cdmVersion

The CDM version of the database.

outFolder

The folder where the output files will be written.

exportFolder

The folder where the csv output files will be written.

modelId

A string used to generate the file names for this model.

evaluationCohortId

A string used to generate the file names for this evaluation cohort.

excludeModelFromEvaluation

Should subjects used in the model be excluded from the evaluation cohort?

randomVisitTable

Table stored in work directory with pre-selected random visits, in the format of the visit_occurrence table.

removeSubjectsWithFutureDates

Should subjects with dates in the future be ignored? Useful for data sources that contain erroneous future dates.

saveEvaluationCohortPlpData

Should the large PLP file for the evaluation cohort be saved? To be used for debugging purposes.

Details

Fits a diagnostic prediction model, and uses it to create an evaluation cohort with probabilities for the health outcome of interest.
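A call might be assembled as in the sketch below. Every value is a placeholder: the cohort ids, schema names, folder, runDateTime format, and the PHEVALUATOR_EXAMPLE_* environment variables are assumptions made for illustration. The call itself only runs when PheValuator and DatabaseConnector are installed and a server has been configured.

```r
# Placeholder arguments; replace with values for your own study.
evaluationArgs <- list(
  phenotype = "Example phenotype",
  analysisName = "Example analysis",
  runDateTime = format(Sys.time(), "%Y-%m-%d %H:%M:%S"),  # format assumed
  databaseId = "ExampleDb",
  xSpecCohortId = 1001,
  xSensCohortId = 1002,
  prevalenceCohortId = 1003,
  caseCohortId = 1004,
  caseFirstOccurrenceOnly = TRUE,
  cdmDatabaseSchema = "cdm_instance.dbo",
  cohortDatabaseSchema = "results.dbo",
  cohortTable = "cohort",
  workDatabaseSchema = "scratch.dbo",
  exportFolder = file.path(tempdir(), "export")
)

# Only attempt the database work when everything needed is available.
runExample <- requireNamespace("PheValuator", quietly = TRUE) &&
  requireNamespace("DatabaseConnector", quietly = TRUE) &&
  nzchar(Sys.getenv("PHEVALUATOR_EXAMPLE_SERVER"))

if (runExample) {
  connectionDetails <- DatabaseConnector::createConnectionDetails(
    dbms = "postgresql",
    server = Sys.getenv("PHEVALUATOR_EXAMPLE_SERVER"),
    user = Sys.getenv("PHEVALUATOR_EXAMPLE_USER"),
    password = Sys.getenv("PHEVALUATOR_EXAMPLE_PASSWORD")
  )
  do.call(
    PheValuator::createEvaluationCohort,
    c(list(connectionDetails = connectionDetails), evaluationArgs)
  )
}
```

Keeping the arguments in a list makes it easy to reuse the same settings across databases by changing only databaseId and the schema entries.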