Computes operating characteristics (sensitivity, specificity, positive predictive value, AUC, and Cohen's kappa) of a cohort definition by comparing it against a reference cohort created from LLM review of KEEPER profiles. Metrics are computed separately for high-certainty reviews, low-certainty reviews, and all reviews combined.

computeCohortOperatingCharacteristics(
  connectionDetails = NULL,
  connection = NULL,
  cohortDatabaseSchema,
  cohortTable,
  cohortDefinitionId,
  referenceCohortDatabaseSchema,
  referenceCohortTableNames,
  referenceCohortDefinitionId
)

Arguments

connectionDetails

An R object of type connectionDetails created using the DatabaseConnector::createConnectionDetails() function. Not required if connection is provided.

connection

The connection to the database server created using DatabaseConnector::connect(). Not required if connectionDetails is provided.

cohortDatabaseSchema

The name of the database schema containing the cohort to evaluate.

cohortTable

The table name containing the cohort to evaluate.

cohortDefinitionId

The cohort definition ID of the cohort to evaluate.

referenceCohortDatabaseSchema

The name of the database schema containing the reference cohort (as uploaded by uploadReferenceCohort()).

referenceCohortTableNames

The table names where the reference cohort and metadata are stored. Should be created using createReferenceCohortTableNames().

referenceCohortDefinitionId

The cohort definition ID of the reference cohort.

Value

A tibble with one row per certainty level ("high", "low", "all") and columns for true positives, false positives, true negatives, false negatives, sensitivity, specificity, PPV (each with lower and upper confidence bounds), AUC, kappa, disease prevalence, and certainty.
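
For orientation, the point estimates in these columns follow from the four counts by the standard textbook definitions. The sketch below is not taken from the package source: the helper name operatingCharacteristicsFromCounts() and the counts are hypothetical, and the confidence bounds and AUC are deliberately left out because this page does not say how they are derived.

```r
# Hypothetical helper (not part of the package): textbook point estimates
# from a 2x2 comparison of the evaluated cohort against the reference cohort.
operatingCharacteristicsFromCounts <- function(tp, fp, fn, tn) {
  n <- tp + fp + fn + tn
  # Observed agreement and the agreement expected by chance, for Cohen's kappa.
  observed <- (tp + tn) / n
  expected <- ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n^2
  list(
    sensitivity = tp / (tp + fn),
    specificity = tn / (tn + fp),
    ppv = tp / (tp + fp),
    prevalence = (tp + fn) / n,
    kappa = (observed - expected) / (1 - expected)
  )
}

# Made-up counts, for illustration only.
metrics <- operatingCharacteristicsFromCounts(tp = 80, fp = 10, fn = 20, tn = 90)
str(metrics)
```

With these made-up counts the sensitivity is 0.8, the specificity is 0.9, and kappa is 0.7.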

Specificity and prevalence are computed both within the reference cohort and, based on the prevalence of the highly sensitive cohort, in the overall population.
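
One way such an extrapolation could work is sketched below. This is an illustration under stated assumptions, not the package's documented method: it assumes the reference cohort is representative of the highly sensitive cohort, and that nobody outside the highly sensitive cohort has the disease or is in the cohort being evaluated. The function name extrapolateToPopulation() and all inputs are hypothetical.

```r
# Hypothetical helper (not part of the package). Assumes everyone outside the
# highly sensitive cohort is a true negative for the evaluated cohort.
extrapolateToPopulation <- function(specificityRef,
                                    prevalenceRef,
                                    sensitiveCohortPrevalence) {
  # Fraction of the overall population that has the disease.
  prevalence <- prevalenceRef * sensitiveCohortPrevalence
  # Fraction of the overall population without the disease: non-cases inside
  # the highly sensitive cohort plus everyone outside it.
  nonCases <- sensitiveCohortPrevalence * (1 - prevalenceRef) +
    (1 - sensitiveCohortPrevalence)
  # False positives can only arise among non-cases inside the sensitive cohort.
  falsePositives <- sensitiveCohortPrevalence * (1 - prevalenceRef) *
    (1 - specificityRef)
  list(
    prevalence = prevalence,
    specificity = 1 - falsePositives / nonCases
  )
}

# Made-up inputs: 2% of the population falls in the highly sensitive cohort.
overall <- extrapolateToPopulation(
  specificityRef = 0.9,
  prevalenceRef = 0.5,
  sensitiveCohortPrevalence = 0.02
)
str(overall)
```

Under these assumptions the overall specificity is much closer to 1 than the within-reference value, because the large group outside the highly sensitive cohort contributes only true negatives.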