Quick Install Guide


Intalling the R package

The preferred way to install the package is by using drat, which will automatically install the latest release and all the latest dependencies. If the drat code fails or you do not want the official release you could use devtools to install the bleading edge version of the package (latest master). Note that the latest master could contain bugs, please report them to us if you experience problems.

To install using drat run:

install.packages("drat")
drat::addRepo("OHDSI")
install.packages("PatientLevelPrediction")

To install using devtools run:

install.packages('devtools')
devtools::install_github("OHDSI/FeatureExtraction")
devtools::install_github('ohdsi/PatientLevelPrediction')

When installing using devtools make sure to close any other Rstudio sessions that are using PatientLevelPrediction or any dependency. Keeping Rstudio sessions open can cause locks that prevent the package installing.

Setting up Python

Many of the classifiers in PatientLevelPrediction use python. To use the python classifiers you need to install and set up the a python environment in R. We used the reticulate package:

library(PatientLevelPrediction)
reticulate::install_miniconda()
configurePython(envname='r-reticulate', envtype='conda')

To add the R keras interface, in Rstudio run:

devtools::install_github("rstudio/keras")
library(keras)
install_keras()

Some of the less frequently used classifiers are considered optional and are not installed by default. To install then, run:

For GBM survival:

reticulate::conda_install(envname='r-reticulate', packages = c('scikit-survival'), forge = TRUE, pip = FALSE, pip_ignore_installed = TRUE, conda = "auto", channel = 'sebp')

For any of the torch models:

reticulate::conda_install(envname='r-reticulate', packages = c('pytorch', 'torchvision', 'cpuonly'), forge = TRUE, pip = FALSE, channel = 'pytorch', pip_ignore_installed = TRUE, conda = 'auto')

Testing the PatientLevelPrediction Installation

To test whether the package is installed correctly run:

library(PatientLevelPrediction)
library(DatabaseConnector)
connectionDetails <- createConnectionDetails(dbms = 'sql_server', 
                                             user = 'username', 
                                             password = 'hidden', 
                                             server = 'your server', 
                                             port = 'your port')
PatientLevelPrediction::checkPlpInstallation(connectionDetails = connectionDetails, 
                                             python = T)

To test the installation (excluding python) run:

library(PatientLevelPrediction)
library(DatabaseConnector)
connectionDetails <- createConnectionDetails(dbms = 'sql_server', 
                                           user = 'username', 
                                           password = 'hidden', 
                                           server = 'your server', 
                                           port = 'your port')
PatientLevelPrediction::checkPlpInstallation(connectionDetails = connectionDetails, 
                                             python = F)

The check can take a while to run since it will build the following models in sequence on simulated data: Logistic Regression, RandomForest, MLP, AdaBoost, Decision Tree, Naive Bayes, KNN, Gradient Boosting. Moreover, it will test the database connection.

Common issues

python environment Mac/linux users:

to make sure R uses the r-reticulate python environment you may need to edit your .Rprofile with the location of the python binary for the PLP environment. Edit the .Rprofile by running:

usethis::edit_r_profile()

and add

Sys.setenv(PATH = paste("your python bin location", Sys.getenv("PATH"), sep=":"))

to the file then save. Where your python bin location is the location returned by

reticulate::conda_list() 

e.g., My PLP virtual environment location was /anaconda3/envs/PLP/bin/python so I added:
Sys.setenv(PATH = paste(“/anaconda3/envs/PLP/bin”, Sys.getenv(“PATH”), sep=“:”))

Old Instructions

To configure python via anaconda

  • Close your RStudio
  • Install python 3.7 using anaconda (https://www.anaconda.com/download) [make sure you pick the correct operating system] and note the installation location. Anaconda should update you path variable with the python binary.
  • Open a new Rstudio and check whether your python is configured correctly by running:
system("python --version") 

If set up correctly you should see “Python 3.x.x :: Anaconda, Inc.” returned.

  • If not set up correctly then:
    • Windows users: make sure your anaconda python binary is in the System PATH environmental variable: go to my computer -> system properties -> advanced system settings Then at the bottom right you’ll see a button: Environmental Variables, clicking on that will enable you to edit the PATH variable. Add the following Anaconda locations to your path: D:\Anaconda3;D:\Anaconda3\Scripts;D:\Anaconda3\Library\bin (this assumes your installation was done to D:\Anaconda3).
    • Mac/Linux users: edit the bash profile to add python in the Path by running in the terminal: touch ~/.bash_profile; open ~/.bash_profile; and adding in the location of python in the PATH variable. Unfortunately, you also need to make an edit to the .Rprofile for R to get the correct PATH. To do this open the .Rprofile by running:
  usethis::edit_r_profile()

and in this file add

Sys.setenv(PATH = paste("your python bin location", Sys.getenv("PATH"), sep=":"))
  • After editing your Path or .Rprofile open a new Rstudio session and test that python is correctly set up via
system("python --version")