Package index
Extracting data from the OMOP CDM database
Functions for getting the necessary data from the database in Common Data Model and saving/loading.
-
createDatabaseDetails()
- Create a setting that holds the details about the cdmDatabase connection for data extraction
-
createRestrictPlpDataSettings()
- Define extra restriction settings applied when calling getPlpData()
-
getPlpData()
- Extract the patient level prediction data from the server
-
getEunomiaPlpData()
- Create a plpData object from the Eunomia database
-
savePlpData()
- Save the plpData to folder
-
loadPlpData()
- Load the plpData from a folder
-
getCohortCovariateData()
- Extracts covariates based on cohorts
-
print(<plpData>)
- Print a plpData object
-
print(<summary.plpData>)
- Print a summary.plpData object
-
summary(<plpData>)
- Summarize a plpData object
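A minimal sketch tying the extraction functions together. The connection values are placeholders and the parameter names (e.g. `targetId`, `outcomeIds`, `sampleSize`) are assumptions based on common usage, not a tested recipe:

```r
library(PatientLevelPrediction)

# Describe how to reach the OMOP CDM database (values are placeholders)
databaseDetails <- createDatabaseDetails(
  connectionDetails = DatabaseConnector::createConnectionDetails(
    dbms = "postgresql", server = "myserver/cdm"
  ),
  cdmDatabaseSchema = "cdm",
  cohortDatabaseSchema = "results",
  cohortTable = "cohort",
  targetId = 1,   # cohort id of the target population (assumed name)
  outcomeIds = 2  # cohort id(s) of the outcome(s) (assumed name)
)

# Optionally restrict the extracted data, e.g. to a sample
restrictSettings <- createRestrictPlpDataSettings(sampleSize = 10000)

# Extract the patient-level prediction data, then persist and reload it
plpData <- getPlpData(
  databaseDetails = databaseDetails,
  covariateSettings = FeatureExtraction::createCovariateSettings(
    useDemographicsGender = TRUE, useDemographicsAge = TRUE
  ),
  restrictPlpDataSettings = restrictSettings
)
savePlpData(plpData, "plpDataFolder")
summary(loadPlpData("plpDataFolder"))
```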
-
createStudyPopulationSettings()
- Create the study population settings
-
createDefaultSplitSetting()
- Create the settings for defining how the plpData are split into test/validation/train sets using default splitting functions (random split stratified by outcome, time-based splitting, or subject-based splitting)
-
createExistingSplitSettings()
- Create the settings for defining how the plpData are split into test/validation/train sets using an existing split - good to use for reproducing results from a different run
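As a sketch, a default stratified split might be configured as below; the parameter names (`testFraction`, `nfold`, `splitSeed`, `type`) are assumptions based on common usage:

```r
# Assumed sketch: hold out 25% as a test set and use 3-fold
# cross-validation on the train set, stratified by outcome so
# each fold keeps roughly the same outcome rate
splitSettings <- createDefaultSplitSetting(
  testFraction = 0.25,
  trainFraction = 0.75,
  splitSeed = 42,
  nfold = 3,
  type = "stratified"
)
```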
-
createSampleSettings()
- Create the settings for defining how the trainData from splitData() are sampled using default sample functions.
-
createFeatureEngineeringSettings()
- Create the settings for defining any feature engineering that will be done
-
createPreprocessSettings()
- Create the settings for preprocessing the trainData.
-
createCohortCovariateSettings()
- Create the settings for extracting covariates based on cohorts
-
createRandomForestFeatureSelection()
- Create the settings for random forest based feature selection
-
createUnivariateFeatureSelection()
- Create the settings for defining any feature selection that will be done
-
createSplineSettings()
- Create the settings for adding a spline for continuous variables
-
createStratifiedImputationSettings()
- Create the settings for using stratified imputation.
-
createNormalizer()
- Create the settings for normalizing the data; the type of normalization is either "minmax" or "robust"
-
createSimpleImputer()
- Create Simple Imputer settings
-
createIterativeImputer()
- Create Iterative Imputer settings
-
createRareFeatureRemover()
- Create the settings for removing rare features
-
createValidationDesign()
- Define the validation design for external validation
-
validateExternal()
- Validate model performance on new data
-
createValidationSettings()
- Define optional settings for performing external validation
-
recalibratePlp()
- Recalibrate an existing model's predictions
-
recalibratePlpRefit()
- Recalibrate a model by refitting it on new data
-
createLogSettings()
- Create the settings for logging the progression of the analysis
-
createExecuteSettings()
- Creates list of settings specifying what parts of runPlp to execute
-
createDefaultExecuteSettings()
- Creates default list of settings specifying what parts of runPlp to execute
-
setAdaBoost()
- Create setting for AdaBoost with python DecisionTreeClassifier base estimator
-
setDecisionTree()
- Create setting for the scikit-learn DecisionTree classifier with python
-
setGradientBoostingMachine()
- Create setting for gradient boosting machine model using gbm_xgboost implementation
-
setLassoLogisticRegression()
- Create modelSettings for lasso logistic regression
-
setMLP()
- Create setting for neural network model with python's scikit-learn. For bigger models, consider using the DeepPatientLevelPrediction package.
-
setNaiveBayes()
- Create setting for naive Bayes model with python
-
setRandomForest()
- Create setting for random forest model using sklearn
-
setSVM()
- Create setting for the python sklearn SVM (SVC function)
-
setIterativeHardThresholding()
- Create setting for Iterative Hard Thresholding model
-
setLightGBM()
- Create setting for gradient boosting machine model using lightGBM (https://github.com/microsoft/LightGBM/tree/master/R-package).
-
setCoxModel()
- Create setting for lasso Cox model
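As an illustrative sketch of the model-settings functions above: each `set*()` call returns a modelSettings object to pass to model fitting, and vectors of hyperparameter values are typically treated as a search grid. The hyperparameter names shown are assumptions, not verified signatures:

```r
# Lasso logistic regression with a fixed seed for reproducibility
lassoSettings <- setLassoLogisticRegression(seed = 42)

# Gradient boosting machine; vectors define a hyperparameter grid
# (parameter names ntrees/maxDepth are assumed for illustration)
gbmSettings <- setGradientBoostingMachine(
  ntrees = c(100, 300),  # grid over number of trees
  maxDepth = c(4, 6),    # grid over tree depth
  seed = 42
)
```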
Single Patient-Level Prediction Model
Functions for training/evaluating/applying a single patient-level-prediction model
-
runPlp()
- Develop and internally evaluate a model using specified settings
-
externalValidateDbPlp()
- Validate a model on new databases
-
savePlpModel()
- Saves the plp model
-
loadPlpModel()
- Loads the plp model
-
savePlpResult()
- Saves the result from runPlp into the location directory
-
loadPlpResult()
- Loads the evaluation dataframe
-
diagnosePlp()
- Investigates the prediction problem settings; use before training a model
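A single-model workflow can be sketched end to end as below; the study population values and argument names are illustrative assumptions:

```r
# Define the prediction problem (assumed parameter names, typical values)
populationSettings <- createStudyPopulationSettings(
  washoutPeriod = 365,                    # days of prior observation required
  removeSubjectsWithPriorOutcome = TRUE,
  riskWindowStart = 1,                    # time-at-risk starts day after index
  riskWindowEnd = 365                     # predict outcome within one year
)

# Develop and internally evaluate a lasso logistic regression model
plpResult <- runPlp(
  plpData = plpData,                      # from getPlpData()
  outcomeId = 2,
  populationSettings = populationSettings,
  modelSettings = setLassoLogisticRegression(),
  saveDirectory = "plpResults"
)

savePlpResult(plpResult, "plpResults/model1")
viewPlp(plpResult)  # interactively inspect performance and settings
```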
Multiple Patient-Level Prediction Models
Functions for training multiple patient-level-prediction models in an efficient way.
-
createModelDesign()
- Specify settings for developing a single model
-
runMultiplePlp()
- Run a list of predictions analyses
-
validateMultiplePlp()
- Externally validate the multiple plp models across new datasets
-
savePlpAnalysesJson()
- Save the modelDesignList to a json file
-
loadPlpAnalysesJson()
- Load the multiple prediction json settings from a file
-
diagnoseMultiplePlp()
- Run a list of predictions diagnoses
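For running several analyses at once, each model design bundles one target/outcome/settings combination and the list is executed together. A sketch under assumed argument names:

```r
# Two designs: same prediction problem, two candidate models
design1 <- createModelDesign(
  targetId = 1,
  outcomeId = 2,
  populationSettings = createStudyPopulationSettings(),
  covariateSettings = FeatureExtraction::createCovariateSettings(
    useDemographicsGender = TRUE, useDemographicsAge = TRUE
  ),
  modelSettings = setLassoLogisticRegression()
)
design2 <- createModelDesign(
  targetId = 1,
  outcomeId = 2,
  populationSettings = createStudyPopulationSettings(),
  covariateSettings = FeatureExtraction::createCovariateSettings(
    useDemographicsGender = TRUE, useDemographicsAge = TRUE
  ),
  modelSettings = setGradientBoostingMachine()
)

# Run all designs against one database
results <- runMultiplePlp(
  databaseDetails = databaseDetails,  # from createDatabaseDetails()
  modelDesignList = list(design1, design2),
  saveDirectory = "multiPlp"
)
```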
-
createStudyPopulation()
- Create a study population
-
splitData()
- Split the plpData into test/train sets using splitting settings of class splitSettings
-
preprocessData()
- A function that wraps around FeatureExtraction::tidyCovariateData to normalize the data and remove rare or redundant features
-
fitPlp()
- Fit a model using the training data
-
predictPlp()
- Generate predictions for a population using a fitted plp model
-
evaluatePlp()
- Evaluate the performance of a fitted model's predictions
-
covariateSummary()
- Compute summary statistics for the covariates
Saving results into database
Functions for saving the prediction model and performances into a database.
-
insertResultsToSqlite()
- Create sqlite database with the results
-
createPlpResultTables()
- Create the results tables to store PatientLevelPrediction models and results into a database
-
createDatabaseSchemaSettings()
- Create the PatientLevelPrediction database result schema settings
-
extractDatabaseToCsv()
- Exports all the results from a database into csv files
-
insertCsvToDatabase()
- Function to insert results into a database from csvs
-
migrateDataModel()
- Migrate the data model
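As a sketch, results written to a folder can be packaged into a portable sqlite database for sharing or viewing; the argument names (`resultLocation`, `sqliteLocation`) are assumptions:

```r
# Collect all results from a save directory into one sqlite file
insertResultsToSqlite(
  resultLocation = "multiPlp",       # folder written by runMultiplePlp()
  sqliteLocation = "multiPlp/sqlite"
)
```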
-
viewPlp()
- Interactively view the performance and model settings
-
viewMultiplePlp()
- Open a local shiny app for viewing the results of multiple PLP analyses
-
viewDatabaseResultPlp()
- Open a local shiny app for viewing the results of PLP analyses from a database
-
plotPlp()
- Plot all the PatientLevelPrediction plots
-
plotSparseRoc()
- Plot the ROC curve using the sparse thresholdSummary data frame
-
plotSmoothCalibration()
- Plot the smooth calibration as detailed in Van Calster et al. "A calibration hierarchy for risk models was defined: from utopia to empirical data" (2016)
-
plotSparseCalibration()
- Plot the calibration
-
plotSparseCalibration2()
- Plot the conventional calibration
-
plotNetBenefit()
- Plot the net benefit
-
plotDemographicSummary()
- Plot the Observed vs. expected incidence, by age and gender
-
plotF1Measure()
- Plot the F1 measure efficiency frontier using the sparse thresholdSummary data frame
-
plotGeneralizability()
- Plot the train/test generalizability diagnostic
-
plotPrecisionRecall()
- Plot the precision-recall curve using the sparse thresholdSummary data frame
-
plotPredictedPDF()
- Plot the Predicted probability density function, showing prediction overlap between true and false cases
-
plotPreferencePDF()
- Plot the preference score probability density function, showing prediction overlap between true and false cases
-
plotPredictionDistribution()
- Plot the side-by-side boxplots of prediction distribution, by class
-
plotVariableScatterplot()
- Plot the variable importance scatterplot
-
outcomeSurvivalPlot()
- Plot the outcome incidence over time
-
createLearningCurve()
- Create a learning curve
-
plotLearningCurve()
- Plot the learning curve
-
simulatePlpData()
- Generate simulated data
-
simulationProfile
- A simulation profile for generating synthetic patient level prediction data
-
toSparseM()
- Convert the plpData in COO format into a sparse R matrix
-
MapIds()
- Map covariate and row Ids so they start from 1
-
listAppend()
- Join two lists
-
listCartesian()
- Cartesian product
-
createTempModelLoc()
- Create a temporary model location
-
configurePython()
- Sets up a python environment to use for PLP (can be conda or venv)
-
setPythonEnvironment()
- Use the python environment created using configurePython()
-
averagePrecision()
- Calculate the average precision
-
brierScore()
- Compute the Brier score
-
calibrationLine()
- Compute the calibration intercept and slope
-
computeAuc()
- Compute the area under the ROC curve
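The evaluation helpers above operate on a prediction data frame; as an assumption, one holding the predicted risk in a `value` column and the observed outcome in an `outcomeCount` column:

```r
# Assumed sketch: a toy prediction data frame with the columns these
# helpers are believed to expect (column names are assumptions)
prediction <- data.frame(
  value = c(0.1, 0.4, 0.35, 0.8),  # predicted risks
  outcomeCount = c(0, 0, 1, 1)     # observed outcome labels
)

computeAuc(prediction)        # area under the ROC curve
averagePrecision(prediction)  # average precision
brierScore(prediction)        # mean squared error of predicted risks
```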
-
ici()
- Calculate the Integrated Calibration Index from Austin and Steyerberg https://onlinelibrary.wiley.com/doi/full/10.1002/sim.8281
-
modelBasedConcordance()
- Calculate the model-based concordance, which is a calculation of the expected discrimination performance of a model under the assumption the model predicts the "TRUE" outcome as detailed in van Klaveren et al. https://pubmed.ncbi.nlm.nih.gov/27251001/
-
computeGridPerformance()
- Computes grid performance with a specified performance function
-
getCalibrationSummary()
- Get a sparse summary of the calibration
-
getDemographicSummary()
- Get a demographic summary
-
getThresholdSummary()
- Calculate all measures for sparse ROC
-
getPredictionDistribution()
- Calculates the prediction distribution
-
sklearnFromJson()
- Loads sklearn python model from json
-
sklearnToJson()
- Saves sklearn python model object to json in path
-
savePlpShareable()
- Save the plp result as json files and csv files for transparent sharing
-
loadPlpShareable()
- Loads the plp result saved as json/csv files for transparent sharing
-
loadPrediction()
- Loads the prediction dataframe from a json file
-
savePrediction()
- Saves the prediction dataframe to a json file
-
pfi()
- Permutation Feature Importance
-
predictCyclops()
- Create predictive probabilities
-
predictGlm()
- predict using a logistic regression model
-
createGlmModel()
- Create a generalized linear model from existing coefficients for use within the PLP framework
-
createSklearnModel()
- Plug an existing scikit learn python model into the PLP framework