Run a list of predictions — runPlpAnalyses • PatientLevelPrediction

Run a list of predictions

runPlpAnalyses(
  connectionDetails,
  cdmDatabaseSchema,
  cdmDatabaseName,
  oracleTempSchema = cdmDatabaseSchema,
  cohortDatabaseSchema = cdmDatabaseSchema,
  cohortTable = "cohort",
  outcomeDatabaseSchema = cdmDatabaseSchema,
  outcomeTable = "cohort",
  cdmVersion = 5,
  onlyFetchData = FALSE,
  outputFolder = "./PlpOutput",
  modelAnalysisList,
  cohortIds,
  cohortNames,
  outcomeIds,
  outcomeNames,
  washoutPeriod = 0,
  maxSampleSize = NULL,
  minCovariateFraction = 0,
  normalizeData = T,
  testSplit = "person",
  testFraction = 0.25,
  splitSeed = NULL,
  nfold = 3,
  verbosity = "INFO",
  settings = NULL
)

Arguments

connectionDetails	An R object of type `connectionDetails` created using the function `createConnectionDetails` in the `DatabaseConnector` package.
cdmDatabaseSchema	The name of the database schema that contains the OMOP CDM instance. Requires read permissions to this database. On SQL Server, this should specify both the database and the schema, for example 'cdm_instance.dbo'.
cdmDatabaseName	A string with a shareable name of the database (this will be shown to OHDSI researchers if the results get transported)
oracleTempSchema	For Oracle only: the name of the database schema where you want all temporary tables to be managed. Requires create/insert permissions to this database.
cohortDatabaseSchema	The name of the database schema that is the location where the target cohorts are available. Requires read permissions to this database.
cohortTable	The name of the table that contains the target cohorts. The table is expected to have the format of the COHORT table: COHORT_DEFINITION_ID, SUBJECT_ID, COHORT_START_DATE, COHORT_END_DATE.
outcomeDatabaseSchema	The name of the database schema that is the location where the data used to define the outcome cohorts is available. Requires read permissions to this database.
outcomeTable	The name of the table that contains the outcome cohorts. The table is expected to have the format of the COHORT table: COHORT_DEFINITION_ID, SUBJECT_ID, COHORT_START_DATE, COHORT_END_DATE.
cdmVersion	The OMOP CDM version used: currently supports "4" and "5".
onlyFetchData	Only fetches and saves the data object to the output folder without running the analysis.
outputFolder	Name of the folder where all the outputs will be written to.
modelAnalysisList	A list of objects of type `modelSettings` as created using the `createPlpModelSettings` function.
cohortIds	A vector of cohortIds that specify all the target cohorts
cohortNames	A vector of cohortNames corresponding to the cohortIds
outcomeIds	A vector of outcomeIds that specify all the outcome cohorts
outcomeNames	A vector of outcomeNames corresponding to the outcomeIds
washoutPeriod	Minimum number of prior observation days
maxSampleSize	Max number of target people to sample from to develop models
minCovariateFraction	Any covariate with an incidence less than this value is ignored
normalizeData	Whether to normalize the covariates
testSplit	How to split the data into test/train sets ("time" or "person")
testFraction	Fraction of data to use as test set
splitSeed	The seed used for the randomization into test/train
nfold	Number of folds used to do cross validation
verbosity	The logging level
settings	Specify the T, O, population, covariate and model settings
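
The arguments above can be combined into a call along the following lines. This is a minimal sketch, not a definitive recipe: the wrapper name `runExamplePredictions`, the schema names, the cohort and outcome IDs and names, and the washout value are all hypothetical placeholders, and the length checks reflect an assumption that "corresponding" means one name per ID. The `connectionDetails` and `modelAnalysisList` objects are assumed to have been created beforehand as described in the argument list.

```r
# Hypothetical wrapper, not part of PatientLevelPrediction.
# connectionDetails: created with DatabaseConnector::createConnectionDetails.
# modelAnalysisList: a list of modelSettings objects (see the argument list).
runExamplePredictions <- function(connectionDetails, modelAnalysisList) {
  # Placeholder IDs and names; replace with the cohorts defined in your database.
  cohortIds <- c(1, 2)
  cohortNames <- c("Target A", "Target B")
  outcomeIds <- c(10)
  outcomeNames <- c("Outcome X")

  # Assumption: each name vector has one entry per ID.
  stopifnot(
    length(cohortIds) == length(cohortNames),
    length(outcomeIds) == length(outcomeNames)
  )

  PatientLevelPrediction::runPlpAnalyses(
    connectionDetails = connectionDetails,
    cdmDatabaseSchema = "cdm_instance.dbo",   # placeholder schema
    cdmDatabaseName = "Example CDM",          # placeholder shareable name
    cohortDatabaseSchema = "results.dbo",     # placeholder schema
    cohortTable = "cohort",
    outcomeDatabaseSchema = "results.dbo",    # placeholder schema
    outcomeTable = "cohort",
    outputFolder = "./PlpOutput",
    modelAnalysisList = modelAnalysisList,
    cohortIds = cohortIds,
    cohortNames = cohortNames,
    outcomeIds = outcomeIds,
    outcomeNames = outcomeNames,
    washoutPeriod = 365,                      # placeholder: one year of prior observation
    testSplit = "person",
    testFraction = 0.25,
    nfold = 3
  )
}
```

Defining the wrapper does not touch the database; the analyses only run when it is called with a valid connection and model settings.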

Value

A data frame with the following columns:

`analysisId`	The unique identifier for a set of analysis choices.
`cohortId`	The ID of the target cohort populations.
`outcomeId`	The ID of the outcome cohort.
`plpDataFolder`	The location where the plpData was saved
`studyPopFile`	The name of the file containing the study population
`evaluationFolder`	The name of the file containing the evaluation, saved as a CSV.
`modelFolder`	The name of the file containing the developed model.

Details

Run a list of predictions for the target cohorts and outcomes of interest. This function will run all specified predictions, meaning that the total number of outcome models is `length(cohortIds) * length(outcomeIds) * length(modelAnalysisList)`.
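
To make the model count concrete, the following base R sketch enumerates the combinations for a small set of inputs. The IDs and the two placeholder strings standing in for `modelSettings` objects are hypothetical, and the grid is only an illustration of the arithmetic, not how the function enumerates analyses internally.

```r
# Hypothetical inputs: two target cohorts, three outcomes, two model settings.
cohortIds <- c(1, 2)
outcomeIds <- c(10, 11, 12)
# Placeholder strings, not real modelSettings objects.
modelAnalysisList <- list("settingA", "settingB")

# One row per (target cohort, outcome, model setting) combination.
analysisGrid <- expand.grid(
  cohortId = cohortIds,
  outcomeId = outcomeIds,
  modelSettingIndex = seq_along(modelAnalysisList)
)

# Total number of outcome models, as given by the formula above.
nModels <- length(cohortIds) * length(outcomeIds) * length(modelAnalysisList)

# The grid has exactly one row per model.
stopifnot(nrow(analysisGrid) == nModels)
print(nModels)  # 2 * 3 * 2 = 12
```

Because the count grows multiplicatively, adding a single outcome here adds four more models (two target cohorts times two model settings).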