DeepPatientLevelPrediction Installation Guide

Introduction

This vignette describes how you need to install the Observational Health Data Science and Informatics (OHDSI) DeepPatientLevelPrediction under Windows, Mac and Linux.

Software Prerequisites

Windows Users

Under Windows the OHDSI Deep Patient Level Prediction (DeepPLP) package requires installing:

R (https://cran.cnr.berkeley.edu/ ) - (R >= 4.0.0, but latest is recommended)
Python - The package is tested with python 3.10, but >= 3.8 should work
Rstudio (https://www.rstudio.com/ )
Java (http://www.java.com )
RTools (https://cran.r-project.org/bin/windows/Rtools/)

Mac/Linux Users

Under Mac and Linux the OHDSI DeepPLP package requires installing:

R (https://cran.cnr.berkeley.edu/ ) - (R >= 4.0.0, but latest is recommended)
Python - The package is tested with python 3.10, but >= 3.8 should work
Rstudio (https://www.rstudio.com/ )
Java (http://www.java.com )
Xcode command line tools(run in terminal: xcode-select –install) [MAC USERS ONLY]

Installing the Package

The preferred way to install the package is by using remotes, which will automatically install the latest release and all the latest dependencies.

If you do not want the official release you could install the bleeding edge version of the package (latest develop branch).

Note that the latest develop branch could contain bugs, please report them to us if you experience problems.

Installing Python environment

Since the package uses pytorch through reticulate a working python installation is required. The package is tested with python 3.10. To install python an easy way is to use miniconda through reticulate:

install.packages('reticulate')
reticulate::install_miniconda()

By default install_minconda() creates an environment r-reticulate with python 3.9. To use instead python 3.10 we can do:

reticulate::conda_install(envname = 'r-reticulate', packages=c('python=3.10'))

If reticulate is having issues finding the conda installation you can use the function reticulate::miniconda_path() to find the default installation location for your miniconda installation. Then you can force reticulate to use the newly generated environment by setting the environment variable RETICULATE_PYTHON to point to the python binary in the environment. For example by adding the following to the .Renviron file:

RETICULATE_PYTHON="/path/to/miniconda/envs/r-reticulate/python/bin"

Then you need to restart you R session. To verify that reticulate finds the correct version. You can call reticulate::py_config().

Once you have a working python environment that reticulate can locate you can install DeepPatientLevelPrediction. If you want to use a specific python environment you can set the environment variable RETICULATE_PYTHON to point to the python executable of that environment in your .Renviron file. You need to do this before installing DeepPatientLevelPrediction.

Installing DeepPatientLevelPrediction using remotes

To install using remotes run:

install.packages("remotes")
remotes::install_github("OHDSI/DeepPatientLevelPrediction")

This should install the required python packages. If that doesn’t happen it can be triggered by calling:

library(DeepPatientLevelPrediction)
torch$randn(10L)

This should print out a tensor with ten different values.

When installing make sure to close any other Rstudio sessions that are using DeepPatientLevelPrediction or any dependency. Keeping Rstudio sessions open can cause locks on windows that prevent the package installing.

Testing Installation

library(PatientLevelPrediction)
library(DeepPatientLevelPrediction)

data(plpDataSimulationProfile)
sampleSize <- 1e3
plpData <- simulatePlpData(
  plpDataSimulationProfile,
  n = sampleSize 
)

populationSettings <- PatientLevelPrediction::createStudyPopulationSettings(
                                                          requireTimeAtRisk = F, 
                                                          riskWindowStart = 1, 
                                                          riskWindowEnd = 365)
# a very simple resnet
modelSettings <- setResNet(numLayers = 2L, 
                           sizeHidden = 64L, 
                           hiddenFactor = 1L,
                           residualDropout = 0, 
                           hiddenDropout = 0.2, 
                           sizeEmbedding = 64L, 
                           estimatorSettings = setEstimator(learningRate = 3e-4,
                                                            weightDecay = 1e-6,
                                                            device='cpu',
                                                            batchSize=128L,
                                                            epochs=3L,
                                                            seed = 42),
                           hyperParamSearch = 'random',
                           randomSample = 1L)

plpResults <- PatientLevelPrediction::runPlp(plpData = plpData,
               outcomeId = 3,
               modelSettings = modelSettings,
               analysisId = 'Test',
               analysisName = 'Testing DeepPlp',
               populationSettings = populationSettings,
               splitSettings = createDefaultSplitSetting(),
               sampleSettings = createSampleSettings(), 
               featureEngineeringSettings = createFeatureEngineeringSettings(), 
               preprocessSettings = createPreprocessSettings(),
               logSettings = createLogSettings(),
               executeSettings = createExecuteSettings(runSplitData = T,
                                                      runSampleData = F,
                                                      runfeatureEngineering = F,
                                                      runPreprocessData = T,
                                                      runModelDevelopment = T,
                                                      runCovariateSummary = T
                                                      ))

Acknowledgments

Considerable work has been dedicated to provide the DeepPatientLevelPrediction package.

citation("DeepPatientLevelPrediction")

## To cite package 'DeepPatientLevelPrediction' in publications use:
## 
##   Fridgeirsson E, Reps J, Chan You S, Kim C, John H (8).
##   _DeepPatientLevelPrediction: Deep Learning For Patient Level
##   Prediction Using Data In The OMOP Common Data Model_. R package
##   version 2.1.0, <https://github.com/OHDSI/DeepPatientLevelPrediction>.
## 
## A BibTeX entry for LaTeX users is
## 
##   @Manual{,
##     title = {DeepPatientLevelPrediction: Deep Learning For Patient Level Prediction Using Data In The
## OMOP Common Data Model},
##     author = {Egill Fridgeirsson and Jenna Reps and Seng {Chan You} and Chungsoo Kim and Henrik John},
##     year = {8},
##     note = {R package version 2.1.0},
##     url = {https://github.com/OHDSI/DeepPatientLevelPrediction},
##   }

Please reference this paper if you use the PLP Package in your work:

Reps JM, Schuemie MJ, Suchard MA, Ryan PB, Rijnbeek PR. Design and implementation of a standardized framework to generate and evaluate patient-level prediction models using observational healthcare data. J Am Med Inform Assoc. 2018;25(8):969-975.

Egill Fridgeirsson

2024-08-02