Extracting data from the OMOP CDM database
Functions for getting the necessary data from a database in the OMOP Common Data Model, and for saving/loading that data.

- Create a setting that holds the details about the cdmDatabase connection for data extraction
- createRestrictPlpDataSettings: defines extra restriction settings when calling getPlpData
- Get the patient level prediction data from the server
- Save the cohort data to a folder
- Load the cohort data from a folder
- Extracts covariates based on cohorts
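A minimal sketch of the extraction step, assuming a PostgreSQL source and illustrative schema names and cohort ids; exact argument names for createDatabaseDetails/getPlpData can vary between package versions, so check the function help pages.

```r
library(PatientLevelPrediction)

# Connection to the server holding the OMOP CDM data (placeholder credentials)
connectionDetails <- DatabaseConnector::createConnectionDetails(
  dbms = "postgresql",
  server = "localhost/ohdsi",
  user = "user",
  password = "secret"
)

# Where the CDM tables and the target/outcome cohorts live (illustrative ids)
databaseDetails <- createDatabaseDetails(
  connectionDetails = connectionDetails,
  cdmDatabaseSchema = "cdm",
  cohortDatabaseSchema = "results",
  cohortTable = "cohort",
  targetId = 1,      # cohort_definition_id of the target cohort
  outcomeIds = 2     # cohort_definition_id(s) of the outcome cohort(s)
)

# Which covariates to extract (FeatureExtraction settings)
covariateSettings <- FeatureExtraction::createCovariateSettings(
  useDemographicsGender = TRUE,
  useDemographicsAge = TRUE,
  useConditionGroupEraLongTerm = TRUE
)

# Extra restrictions applied when pulling the data
restrictSettings <- createRestrictPlpDataSettings(washoutPeriod = 365)

plpData <- getPlpData(
  databaseDetails = databaseDetails,
  covariateSettings = covariateSettings,
  restrictPlpDataSettings = restrictSettings
)

savePlpData(plpData, "plpData")    # save the extracted data to a folder
plpData <- loadPlpData("plpData")  # and load it back later
```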
|
Settings for designing a prediction model
Design settings required when developing a model.

- Create the study population settings
- Create the settings for defining how the plpData are split into test/validation/train sets using default splitting functions (either random stratified by outcome, time, or subject splitting)
- Create the settings for defining how the trainData from splitData are sampled using default sample functions
- Create the settings for defining any feature engineering that will be done
- Create the settings for preprocessing the trainData
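A sketch of the main design settings; the values shown are examples rather than recommendations, and argument names may differ slightly across package versions.

```r
# Define who is in the population and the time-at-risk window
populationSettings <- createStudyPopulationSettings(
  removeSubjectsWithPriorOutcome = TRUE,
  requireTimeAtRisk = TRUE,
  riskWindowStart = 1,
  riskWindowEnd = 365
)

# 75/25 train/test split with 3-fold cross-validation, stratified by outcome
splitSettings <- createDefaultSplitSetting(
  testFraction = 0.25,
  nfold = 3,
  splitSeed = 42,
  type = "stratified"
)

# Drop very rare covariates and normalise the remainder
preprocessSettings <- createPreprocessSettings(
  minFraction = 0.001,
  normalize = TRUE
)
```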
|
Optional design settings
Settings for optional steps that can be used in the PLP pipeline.

- Extracts covariates based on cohorts
- Create the settings for random forest based feature selection
- Create the settings for defining any feature selection that will be done
- Create the settings for adding a spline for continuous variables
- Create the settings for adding a spline for continuous variables
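A hedged sketch of the optional feature-selection settings; the helper names (createUnivariateFeatureSelection, createRandomForestFeatureSelection) and their arguments are assumptions based on the descriptions above, so confirm them against your package version.

```r
# Keep the 100 covariates most associated with the outcome (assumed signature)
featureEngineeringSettings <- createUnivariateFeatureSelection(k = 100)

# Or select features using a random forest (assumed signature)
featureEngineeringSettings <- createRandomForestFeatureSelection(
  ntrees = 2000,
  maxDepth = 17
)
```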
|
External validation

- createValidationDesign - Define the validation design for external validation
- externalValidatePlp - Validate model performance on new data
- createValidationSettings - Define optional settings for performing external validation
- recalibratePlp
- recalibratePlpRefit
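A hedged sketch of recalibration on an external validation result; extValidation$prediction is a placeholder for the prediction data frame produced by the validation, and the method names accepted by recalibratePlp may differ by version.

```r
# Weak recalibration adjusts the intercept and slope of the predicted risks
recalibrated <- recalibratePlp(
  prediction = extValidation$prediction,  # placeholder prediction data frame
  analysisId = "Analysis_1",
  method = "weakRecalibration"
)
```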
|
Execution settings when developing a model
Execution settings required when developing a model.

- Create the settings for logging the progression of the analysis
- Creates list of settings specifying what parts of runPlp to execute
- Creates default list of settings specifying what parts of runPlp to execute
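A small example of the execution settings; createDefaultExecuteSettings() turns on all pipeline steps, while createExecuteSettings() lets individual steps be switched on or off (values illustrative).

```r
# Log progress at INFO level
logSettings <- createLogSettings(verbosity = "INFO", logName = "runPlp")

# Run every step: split, preprocess, develop the model, covariate summary
executeSettings <- createDefaultExecuteSettings()
```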
|
Binary Classification Models
Functions for setting binary classifiers and their hyper-parameter search.

- Create setting for AdaBoost with python DecisionTreeClassifier base estimator
- Create setting for the scikit-learn 1.0.1 DecisionTree with python
- Create setting for gradient boosting machine model using gbm_xgboost implementation
- Create setting for knn model
- Create setting for lasso logistic regression
- Create setting for neural network model with python
- Create setting for naive bayes model with python
- Create setting for random forest model with python (very fast)
- Create setting for the python sklearn SVM (SVC function)
- Create setting for lasso logistic regression
- Create setting for gradient boosting machine model using lightGBM (https://github.com/microsoft/LightGBM/tree/master/R-package)
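Each set*() function returns a model specification plus the hyper-parameter grid searched during cross-validation; the hyper-parameter names and values below are illustrative.

```r
# Regularised logistic regression (cross-validated LASSO penalty)
lrModel <- setLassoLogisticRegression()

# Gradient boosting machine with a small hyper-parameter grid
gbmModel <- setGradientBoostingMachine(
  ntrees = c(100, 300),
  maxDepth = c(4, 8),
  learnRate = c(0.05, 0.1)
)
```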
|
Survival Models
Functions for setting survival models and their hyper-parameter search.

- Create setting for lasso Cox model
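The survival setting follows the same pattern as the binary classifiers, for example:

```r
# Regularised Cox proportional hazards model (default hyper-parameter search)
coxModel <- setCoxModel()
```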
|
Single Patient-Level Prediction Model
Functions for training/evaluating/applying a single patient-level prediction model.

- runPlp - Develop and internally evaluate a model using specified settings
- externalValidateDbPlp - Validate a model on new databases
- Saves the plp model
- Loads the plp model
- Saves the result from runPlp into the location directory
- Loads the evaluation dataframe
- diagnostic - Investigates the prediction problem settings - use before training a model
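A minimal development run that ties the earlier settings together; the runPlp argument list is abbreviated here and may differ slightly between versions.

```r
plpResult <- runPlp(
  plpData = plpData,
  outcomeId = 2,
  analysisId = "Analysis_1",
  populationSettings = populationSettings,
  splitSettings = splitSettings,
  preprocessSettings = preprocessSettings,
  modelSettings = lrModel,
  logSettings = logSettings,
  executeSettings = executeSettings,
  saveDirectory = "./plpResults"
)

# Persist and reload the full result object
savePlpResult(plpResult, "./plpResults/Analysis_1")
plpResult <- loadPlpResult("./plpResults/Analysis_1")
```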
|
Multiple Patient-Level Prediction Models
Functions for training multiple patient-level prediction models in an efficient way.

- Specify settings for developing a single model
- Run a list of prediction analyses
- Externally validate the multiple plp models across new datasets
- Save the modelDesignList to a json file
- Load the multiple prediction json settings from a file
- Run a list of prediction diagnoses
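A sketch of the multi-analysis workflow, reusing objects from the earlier sketches: each createModelDesign() call captures one complete design, and runMultiplePlp() executes the list against a database. Arguments shown are illustrative.

```r
modelDesign <- createModelDesign(
  targetId = 1,
  outcomeId = 2,
  populationSettings = populationSettings,
  covariateSettings = covariateSettings,
  preprocessSettings = preprocessSettings,
  splitSettings = splitSettings,
  modelSettings = setLassoLogisticRegression()
)

runMultiplePlp(
  databaseDetails = databaseDetails,
  modelDesignList = list(modelDesign),
  saveDirectory = "./multiplePlp"
)
```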
|
Individual pipeline functions
Functions for running parts of the PLP workflow.

- Create a study population
- Split the plpData into test/train sets using a splitting settings object of class splitSettings
- A function that wraps around FeatureExtraction::tidyCovariateData to normalise the data and remove rare or redundant features
- fitPlp
- predictPlp
- evaluatePlp
- covariateSummary
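The individual steps that runPlp() wraps can also be called directly; this sketch assumes the objects created earlier, and some functions (for example fitPlp) may require additional arguments such as an analysis path in your version.

```r
population <- createStudyPopulation(
  plpData = plpData,
  outcomeId = 2,
  populationSettings = populationSettings
)

# Split into Train/Test according to the split settings
data <- splitData(plpData, population = population, splitSettings = splitSettings)

# Fit the model on the training partition
model <- fitPlp(
  trainData = data$Train,
  modelSettings = lrModel,
  analysisId = "Analysis_1"
)

# Apply the fitted model to get predicted risks
prediction <- predictPlp(plpModel = model, plpData = plpData, population = population)
```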
|
Saving results into a database
Functions for saving the prediction model and performances into a database.

- Create sqlite database with the results
- Create the results tables to store PatientLevelPrediction models and results into a database
- Populate the PatientLevelPrediction results tables
- Function to add the run plp (development or validation) to database
- Create the PatientLevelPrediction database result schema settings
- Create a list with the database details and database meta data entries
- Insert a diagnostic result into a PLP result schema database
- Insert multiple diagnosePlp results saved to a directory into a PLP result schema database
- Exports all the results from a database into csv files
- Function to insert results into a database from csvs
- Insert a model design into a PLP result schema database
- Migrate data model
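For local review, the results saved by runPlp or validation can be collected into a SQLite results database that the shiny viewer reads; the argument names here are assumptions, so check ?insertResultsToSqlite.

```r
insertResultsToSqlite(
  resultLocation = "./plpResults",         # folder containing runPlp results
  cohortDefinitions = NULL,                # optional cohort definition metadata
  sqliteLocation = "./plpResults/sqlite"   # where to create the sqlite database
)
```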
|
Shiny Viewers
Functions for viewing results via a shiny app.

- viewPlp - Interactively view the performance and model settings
- Open a local shiny app for viewing the results of multiple PLP analyses
- Open a local shiny app for viewing the results of PLP analyses from a database
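Typical viewer calls, assuming the result objects and folders from the earlier sketches:

```r
# Inspect a single development result interactively
viewPlp(runPlp = plpResult)

# Browse every analysis saved by runMultiplePlp in a directory
viewMultiplePlp(analysesLocation = "./multiplePlp")
```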
|
Plotting
Functions for various performance plots.

- Plot all the PatientLevelPrediction plots
- Plot the ROC curve using the sparse thresholdSummary data frame
- Plot the smooth calibration as detailed in Van Calster et al. "A calibration hierarchy for risk models was defined: from utopia to empirical data" (2016)
- Plot the calibration
- Plot the conventional calibration
- Plot the observed vs. expected incidence, by age and gender
- Plot the F1 measure efficiency frontier using the sparse thresholdSummary data frame
- Plot the train/test generalizability diagnostic
- Plot the precision-recall curve using the sparse thresholdSummary data frame
- Plot the predicted probability density function, showing prediction overlap between true and false cases
- Plot the preference score probability density function, showing prediction overlap between true and false cases
- Plot the side-by-side boxplots of prediction distribution, by class
- Plot the variable importance scatterplot
- Plot the outcome incidence over time
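Each plotting function takes a runPlp result and generally returns a ggplot object; plotPlp() writes the full set to disk. The calls below are illustrative.

```r
plotSparseRoc(plpResult)          # ROC curve
plotSmoothCalibration(plpResult)  # smooth calibration curve
plotPrecisionRecall(plpResult)    # precision-recall curve

# Generate all standard plots into a folder
plotPlp(plpResult, saveLocation = "./plots")
```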
|
Learning Curves
Functions for creating and plotting learning curves.

- createLearningCurve
- plotLearningCurve
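A hedged sketch of a learning curve: the model is refit on increasing fractions of the training data and the chosen metric is plotted. The argument names (trainFractions, metric) are assumptions to verify against your package version.

```r
learningCurve <- createLearningCurve(
  plpData = plpData,
  outcomeId = 2,
  modelSettings = setLassoLogisticRegression(),
  trainFractions = c(0.25, 0.5, 0.75, 1.0),
  saveDirectory = "./learningCurve"
)

plotLearningCurve(learningCurve, metric = "AUROC")
```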
|
Simulation
Functions for simulating patient-level prediction data objects.

- Generate simulated data
- A simulation profile
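Simulated data is convenient for testing the pipeline without database access; the bundled profile object is assumed to be named plpDataSimulationProfile here.

```r
data("plpDataSimulationProfile")                                # bundled simulation profile
simData <- simulatePlpData(plpDataSimulationProfile, n = 1000)  # ~1000 simulated patients
```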
|
Data manipulation functions
Functions for manipulating data.

- Convert the plpData in COO format into a sparse R matrix
- Map covariate and row Ids so they start from 1
|
Helper/utility functions

- Join two lists
- Cartesian product
- Create a temporary model location
- Sets up a virtual environment to use for PLP (can be conda or python)
- Use the virtual environment created using configurePython()
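The python-based classifiers need a python environment; a one-time setup followed by reuse in later sessions might look like this (the environment name is arbitrary).

```r
configurePython(envname = "PLP", envtype = "conda")       # create the environment once
setPythonEnvironment(envname = "PLP", envtype = "conda")  # activate it in a new session
```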
|
Evaluation measures

- Calculate the accuracy
- Calculate the average precision
- brierScore
- calibrationLine
- Compute the area under the ROC curve
- Calculate the f1Score
- Calculate the falseDiscoveryRate
- Calculate the falseNegativeRate
- Calculate the falseOmissionRate
- Calculate the falsePositiveRate
- Calculate the Integrated Calibration Index from Austin and Steyerberg https://onlinelibrary.wiley.com/doi/full/10.1002/sim.8281
- Calculate the model-based concordance, which is a calculation of the expected discrimination performance of a model under the assumption the model predicts the "TRUE" outcome, as detailed in van Klaveren et al. https://pubmed.ncbi.nlm.nih.gov/27251001/
- Calculate the negativeLikelihoodRatio
- Calculate the negativePredictiveValue
- Calculate the positiveLikelihoodRatio
- Calculate the positivePredictiveValue
- Calculate the sensitivity
- Calculate the specificity
- Computes grid performance with a specified performance function
- Calculate the diagnostic odds ratio
- Get a sparse summary of the calibration
- Get a calibration per age/gender groups
- Calculate all measures for sparse ROC
- Calculate all measures for sparse ROC when prediction is binary classification
- Calculates the prediction distribution
- Calculates the prediction distribution
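These measures are normally computed via evaluatePlp()/runPlp(), but they can also be called directly on a prediction data frame; the expected columns (e.g. value, outcomeCount) are assumptions to verify in the function help.

```r
auc <- computeAuc(prediction)        # area under the ROC curve
ap  <- averagePrecision(prediction)  # average precision
```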
|
Saving/loading models as json
Functions for saving or loading models as json.

- Loads sklearn python model from json
- Saves sklearn python model object to json in path
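A round trip of a fitted scikit-learn classifier through json; the model object and path are placeholders.

```r
sklearnToJson(model = fittedSklearnModel, path = "model.json")  # placeholder model object
restoredModel <- sklearnFromJson(path = "model.json")
```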
|
Load/save for sharing
Functions for loading/saving objects for sharing.

- Save the plp result as json files and csv files for transparent sharing
- Loads the plp result saved as json/csv files for transparent sharing
- Loads the prediction dataframe from csv
- Saves the prediction dataframe to RDS
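For transparent sharing, the shareable helpers save the result as json/csv files and load them back; paths are placeholders.

```r
savePlpShareable(plpResult, saveDirectory = "./shareable")
sharedResult <- loadPlpShareable(loadDirectory = "./shareable")
```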
|
Feature importance

- pfi
|
Other functions

- Create predictive probabilities