This page describes how you can use renv to make sure the right dependencies are installed when people run your study package. Please read the entire page, even if you are already familiar with renv.

What is renv?

The renv package offers two main benefits:

  1. Specifying the exact package versions that must be loaded for executing a specific study. It offers functions for recording the versions of packages that are currently used, and for restoring a library of packages on the same computer (in the future) or other computers, thus greatly improving reproducibility.

  2. Isolating the package library to a project (for example an RStudio project). This means that different projects (e.g. different studies) can use very different package versions on the same computer without conflicting. When switching from one project to another, R will switch to the appropriate library.
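
To check which library is active in the current R session, base R's .libPaths() can be used. This is only a small illustration: inside a project managed by renv, the first path typically points into the project's own renv folder, while outside such a project it points to the user or system library.

```r
# The first entry of .libPaths() is the library where packages are installed
# and looked up first in this session.
activeLibrary <- .libPaths()[1]
print(activeLibrary)
```

Running this in two different projects is a quick way to confirm that each one uses its own library.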

The renv lock file

At the core of renv is the ‘renv.lock’ file that specifies all required packages. This file should be in the root folder of the project. The structure of the lock file is not very complicated, and it can easily be edited by hand if need be. An example of a full lock file can be found here. Here’s an example partial lock file, with just 3 entries:

{
    "R" : {
        "Version" : "4.0.3",
        "Repositories" : [
            {
                "Name" : "CRAN",
                "URL" : "https://cloud.r-project.org"
            }
        ]
    },
    "Packages" : {
        "rlang" : {
            "Package" : "rlang",
            "Version" : "0.4.11",
            "Source" : "Repository",
            "Repository" : "CRAN"
        },
        "FeatureExtraction" : {
            "Package" : "FeatureExtraction",
            "Version" : "3.1.1",
            "Source" : "GitHub",
            "RemoteType" : "github",
            "RemoteHost" : "api.github.com",
            "RemoteRepo" : "FeatureExtraction",
            "RemoteUsername" : "ohdsi",
            "RemoteRef" : "v3.1.1"
        },
        "Covid19EstimationHydroxychloroquine" : {
            "Package" : "Covid19EstimationHydroxychloroquine",
            "Version" : "0.0.1",
            "Source" : "GitHub",
            "RemoteType" : "github",
            "RemoteHost" : "api.github.com",
            "RemoteRepo" : "Covid19EstimationHydroxychloroquine",
            "RemoteUsername" : "ohdsi-studies",
            "RemoteRef" : "main"
        }
    }
}

We can see different types of entries in this lock file. Every entry has a name, and the following fields:

  • Package: The name of the package. This should match the name of the entry.
  • Version: The package version that needs to be installed.
  • Source: The source where the package can be obtained.
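
To see these fields for every entry at once, a lock file can be flattened into a table. The sketch below is one possible way to do that. summarizeLockEntries() is a hypothetical helper (not part of renv or OhdsiRTools), and it assumes the lock file has already been parsed into a nested list, for example with jsonlite::fromJSON("renv.lock", simplifyVector = FALSE) if jsonlite is installed. A small in-memory list stands in for the parsed file here.

```r
# Hypothetical helper: flatten the "Packages" section of a parsed lock file
# into a data frame with one row per entry.
summarizeLockEntries <- function(lock) {
  entries <- lock[["Packages"]]
  getField <- function(entry, field) {
    value <- entry[[field]]
    if (is.null(value)) NA_character_ else as.character(value)
  }
  data.frame(
    Entry = names(entries),
    Package = unname(vapply(entries, getField, character(1), field = "Package")),
    Version = unname(vapply(entries, getField, character(1), field = "Version")),
    Source = unname(vapply(entries, getField, character(1), field = "Source")),
    stringsAsFactors = FALSE
  )
}

# In-memory stand-in for a parsed lock file (illustration only):
lock <- list(
  Packages = list(
    rlang = list(Package = "rlang", Version = "0.4.11",
                 Source = "Repository", Repository = "CRAN"),
    FeatureExtraction = list(Package = "FeatureExtraction", Version = "3.1.1",
                             Source = "GitHub", RemoteRef = "v3.1.1")
  )
)
entrySummary <- summarizeLockEntries(lock)
print(entrySummary)

# The name of each entry should match its Package field:
stopifnot(all(entrySummary$Entry == entrySummary$Package))
```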

In an OHDSI lock file, the following types can be encountered:

CRAN packages

        "rlang" : {
            "Package" : "rlang",
            "Version" : "0.4.11",
            "Source" : "Repository",
            "Repository" : "CRAN"
        },

Most packages, like rlang, but also several HADES packages like DatabaseConnector, will be available from CRAN, and will therefore have Source equal to “Repository”, and one additional field:

  • Repository: The name of the repository. This is always “CRAN”.

HADES GitHub packages

        "FeatureExtraction" : {
            "Package" : "FeatureExtraction",
            "Version" : "3.1.1",
            "Source" : "GitHub",
            "RemoteType" : "github",
            "RemoteHost" : "api.github.com",
            "RemoteRepo" : "FeatureExtraction",
            "RemoteUsername" : "ohdsi",
            "RemoteRef" : "v3.1.1"
        },

Many HADES packages need to be installed from GitHub rather than CRAN. See the ‘Availability’ column on the Package statuses page to learn which packages are not in CRAN. For these packages, Source is equal to “GitHub”. Other required fields are:

  • RemoteType: Should always be “github”.
  • RemoteHost: Should always be “api.github.com”.
  • RemoteRepo: The name of the HADES package repo, for example “FeatureExtraction”.
  • RemoteUsername: This should always be “ohdsi” for HADES packages.
  • RemoteRef: The GitHub reference. For HADES packages it is recommended to use the git tag of the version, for example “v3.1.1” for version 3.1.1. (In HADES, all package releases are automatically tagged with a “v” prefix and then the version number). This could also be set to the name of a branch (e.g. “main”), but the contents of a branch tend to change over time, so this will break reproducibility.
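
Because a branch name in RemoteRef can undermine reproducibility, it may be useful to scan a lock file for GitHub entries that are not pinned to a version tag. The following is a minimal sketch, assuming the lock file has been parsed into a nested list and that tags follow the "v" plus version number convention described above. findUnpinnedEntries() is a hypothetical helper name, and the list below is an in-memory stand-in for a parsed lock file.

```r
# Hypothetical helper: return the names of GitHub entries whose RemoteRef is
# not the version tag "v<Version>".
findUnpinnedEntries <- function(lock) {
  entries <- lock[["Packages"]]
  isUnpinned <- vapply(entries, function(entry) {
    if (!identical(entry[["Source"]], "GitHub")) {
      return(FALSE)
    }
    !identical(entry[["RemoteRef"]], paste0("v", entry[["Version"]]))
  }, logical(1))
  names(entries)[isUnpinned]
}

# In-memory stand-in for a parsed lock file (illustration only):
lock <- list(
  Packages = list(
    rlang = list(Package = "rlang", Version = "0.4.11", Source = "Repository"),
    FeatureExtraction = list(Package = "FeatureExtraction", Version = "3.1.1",
                             Source = "GitHub", RemoteRef = "v3.1.1"),
    Covid19EstimationHydroxychloroquine = list(
      Package = "Covid19EstimationHydroxychloroquine", Version = "0.0.1",
      Source = "GitHub", RemoteRef = "main")
  )
)
unpinned <- findUnpinnedEntries(lock)
print(unpinned)
```

A flagged entry is not necessarily a mistake: a reference that is a commit hash is also flagged by this simple check, even though it is pinned.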

OHDSI Study packages

        "Covid19EstimationHydroxychloroquine" : {
            "Package" : "Covid19EstimationHydroxychloroquine",
            "Version" : "0.0.1",
            "Source" : "GitHub",
            "RemoteType" : "github",
            "RemoteHost" : "api.github.com",
            "RemoteRepo" : "Covid19EstimationHydroxychloroquine",
            "RemoteUsername" : "ohdsi-studies",
            "RemoteRef" : "main"
        }

As discussed later in this document, sometimes we would like to include an OHDSI study package in the lock file as well. These entries are identical in terms of Source, RemoteType, and RemoteHost to those for HADES GitHub packages, but differ in these fields:

  • RemoteRepo: The name of the OHDSI study package repo, for example “Covid19EstimationHydroxychloroquine”.
  • RemoteUsername: This should always be “ohdsi-studies” for OHDSI study packages.
  • RemoteRef: The GitHub reference. This is often “main”, for the main branch, but can also refer to the name of a git tag, or the hash of a specific git commit.

Creating a renv lock file

One could construct a lock file by hand, but this would take a lot of work. The renv package itself provides the snapshot() function for automatically constructing lock files, but this function only really works well for dependencies from CRAN. For OHDSI studies it is therefore recommended to use the createRenvLockFile() function in the OhdsiRTools package. You can install OhdsiRTools using remotes:

remotes::install_github("ohdsi/OhdsiRTools")

The createRenvLockFile() function has two modes: “auto”, and “description”. The description mode is for more advanced users.

Auto mode

In auto mode, the createRenvLockFile() function will leverage renv::init() to scan all R scripts in the project folder and subfolders for references to packages, and will include those (and their dependencies). The advantage of this mode is that it automatically captures all dependencies. The disadvantage is that this may pull in too many dependencies, making for large lock files, and challenges at partner sites.

The auto mode is the default, and comes with two potentially important arguments:

  • rootPackage: The name of the study package. Only required if includeRootPackage = TRUE.
  • includeRootPackage: Whether the root package (i.e. the study package) should be included in the lock file. This is discussed later.

For example, we could create a lock file for the Covid19EstimationHydroxychloroquine study package using this command:

OhdsiRTools::createRenvLockFile(rootPackage = "Covid19EstimationHydroxychloroquine",
                                includeRootPackage = TRUE)
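
Before creating a lock file in auto mode, it can help to preview which packages are referenced in the project's R files. The sketch below uses renv::dependencies(), renv's own function for scanning a folder for package references. It is meant as a rough preview only: it assumes renv is installed and the working directory is the study project, and the set of packages that createRenvLockFile() ends up including may differ (for example because dependencies of dependencies are added).

```r
# Preview the packages referenced in the R files of the current folder.
detectedPackages <- character(0)
if (requireNamespace("renv", quietly = TRUE)) {
  deps <- renv::dependencies(path = ".", progress = FALSE)
  detectedPackages <- sort(unique(as.character(deps$Package)))
}
print(detectedPackages)
```

If this list contains packages that the study does not really need, removing the scripts that reference them, or switching to description mode, may keep the lock file smaller.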

Description mode (Advanced)

In description mode, the createRenvLockFile() function will only include the dependencies of the study package as documented in the package’s DESCRIPTION file. The advantage of this mode is that it can lead to substantially smaller lock files. The disadvantage is that it requires the user to accurately document all dependencies in the DESCRIPTION file.

In description mode, these arguments of createRenvLockFile() are of particular importance:

  • rootPackage: The name of the study package for which we would like to capture all dependencies in the lock file. This package’s dependencies as documented in its DESCRIPTION will be included.
  • additionalRequiredPackages: We may want to have other packages available as well, for example keyring. We can specify those here.
  • includeRootPackage: Whether the root package (i.e. the study package) should be included in the lock file. This is discussed later.

Important: The createRenvLockFile() function uses the information in the ‘DESCRIPTION’ file of your study package, so make sure this contains all dependencies. A good way to verify that all required packages are in the DESCRIPTION is by running R check on your study package. The function also assumes you have the right package versions installed, so make sure your study package runs before calling createRenvLockFile().
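
To review what is currently declared in a DESCRIPTION file, the dependency fields can be read with base R's read.dcf(). This is a minimal sketch and not how createRenvLockFile() itself works internally. listDescriptionDependencies() is a hypothetical helper, and the package name and version constraints in the example file are made up for illustration.

```r
# Hypothetical helper: list the packages named in the Depends and Imports
# fields of a DESCRIPTION file, using only base R.
listDescriptionDependencies <- function(descriptionFile = "DESCRIPTION") {
  description <- read.dcf(descriptionFile, fields = c("Depends", "Imports"))
  values <- description[1, ]
  values <- values[!is.na(values)]
  packages <- unlist(strsplit(values, ","))
  # Drop version constraints such as "(>= 4.0.0)" and surrounding whitespace:
  packages <- trimws(gsub("\\(.*?\\)", "", packages, perl = TRUE))
  packages <- packages[nzchar(packages)]
  # R itself is not a package, so leave it out:
  sort(setdiff(packages, "R"))
}

# Example DESCRIPTION written to a temporary file (contents are made up):
descriptionFile <- tempfile()
writeLines(c(
  "Package: MyStudyPackage",
  "Version: 0.0.1",
  "Depends: R (>= 4.0.0), DatabaseConnector (>= 4.0.0)",
  "Imports: FeatureExtraction,",
  "    rlang"
), descriptionFile)
declaredPackages <- listDescriptionDependencies(descriptionFile)
print(declaredPackages)
```

Comparing this list against the packages your study code actually calls is a quick complement to running R check.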

For example, we could create a lock file for the Covid19EstimationHydroxychloroquine study package in description mode using this command:

OhdsiRTools::createRenvLockFile(mode = "description",
                                rootPackage = "Covid19EstimationHydroxychloroquine",
                                additionalRequiredPackages = "keyring",
                                includeRootPackage = TRUE)

Including the study (root) package

The root package is your study package. If your study package is in the ohdsi-studies GitHub organization, it is recommended to include it in the lock file. That way, renv can automatically install that package as well. Otherwise, people will have to clone your study package repo, run renv::restore(), and then build the study package themselves.

How to use a lock file to restore an environment

There are two ways to use the lock file, depending on whether the root package is included in the lock file.

When the study (root) package is included

When the study package is included in the lock file, as described above, we can use renv to install not only the dependencies, but the study package itself as well. This means the user will not need to clone the study package repo, and does not have to build the study package separately.

Follow these steps to install the study package and all its dependencies:

  1. Make sure R, RTools, Java, and RStudio are properly configured as described here.

  2. Create an empty folder or new RStudio project, and in R, use the following code to install the study package and its dependencies:

    install.packages("renv")
    download.file("https://raw.githubusercontent.com/ohdsi-studies/Covid19SubjectsAesiIncidenceRate/master/renv.lock", "renv.lock")
    renv::init()
    # When asked, choose to "Restore the project from the lockfile".

Here, the URL used in download.file() points to the lock file in the study's ohdsi-studies repo.

Important: the URL should point to the raw lock file contents, not the GitHub page with additional information about the file. So “https://raw.githubusercontent.com/ohdsi-studies/Covid19SubjectsAesiIncidenceRate/master/renv.lock”, not “https://github.com/ohdsi-studies/Covid19SubjectsAesiIncidenceRate/blob/master/renv.lock”.
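
If you only have the GitHub page URL of a lock file, the raw URL can be derived with a simple text substitution. The sketch below shows one way to do this. toRawGitHubUrl() is a hypothetical helper, and it assumes the page URL follows the usual "github.com/<owner>/<repo>/blob/<ref>/<path>" pattern.

```r
# Hypothetical helper: convert a GitHub "blob" page URL into the matching
# raw.githubusercontent.com URL. URLs that do not match are returned unchanged.
toRawGitHubUrl <- function(url) {
  sub("^https://github\\.com/([^/]+)/([^/]+)/blob/",
      "https://raw.githubusercontent.com/\\1/\\2/",
      url)
}

pageUrl <- "https://github.com/ohdsi-studies/Covid19SubjectsAesiIncidenceRate/blob/master/renv.lock"
rawUrl <- toRawGitHubUrl(pageUrl)
print(rawUrl)
```

The resulting URL can then be passed to download.file() as in the installation code above.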

When the study (root) package is not included

When the study package itself is not included in the lock file, the user will need to clone the project (which requires git to be installed), and build the study package after all dependencies have been installed.

Follow these steps to install the study package and all its dependencies:

  1. Make sure R, RTools, Java, and RStudio are properly configured as described here.

  2. Make sure git is installed.

  3. Clone the study package repo.

  4. Open the study package project in RStudio, and type renv::restore().

  5. Build the study package.
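
For those who prefer the R console over the RStudio Build pane, steps 4 and 5 can be sketched as a small function. This is an illustration under assumptions, not the only way to build the package: installClonedStudyPackage() is a hypothetical helper, it assumes the R session was started in the cloned project (so that renv has activated the project library), and it installs the package from source with base R's install.packages().

```r
# Hypothetical helper: restore the dependencies from renv.lock, then install
# the study package itself from the cloned source folder.
installClonedStudyPackage <- function(projectFolder = ".") {
  # Step 4: install all dependencies listed in the lock file.
  renv::restore(prompt = FALSE)
  # Step 5: build and install the study package from its source folder.
  install.packages(projectFolder, repos = NULL, type = "source")
}

# Usage, from an R session opened in the cloned study package folder:
# installClonedStudyPackage()
```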

Adding a function for checking dependencies at runtime

Even though renv tries to make sure all the versions specified in the lock file are installed in the library, the wrong versions can still end up being loaded at runtime. One scenario where this has happened is when the user forgets to run renv::restore() before running the study package. To make absolutely sure the right versions are used, it is recommended to add a function to the study package that checks all dependency versions, and to execute that function as the first step in the study’s main function. Below is an example function. Note that this function depends on RJSONIO and dplyr, so make sure these are in your lock file as well. It also reads ‘renv.lock’ from the current working directory, so it should be run from the project root.

verifyDependencies <- function() {
  # Read the lock file from the current working directory:
  expected <- RJSONIO::fromJSON("renv.lock")
  # Select the package entries by name rather than by position, in case the
  # lock file has other top-level sections:
  expected <- dplyr::bind_rows(expected[["Packages"]])
  # Base packages ship with R itself, so they are not checked:
  basePackages <- rownames(installed.packages(priority = "base"))
  expected <- expected[!expected$Package %in% basePackages, ]
  # Write both versions in the same dotted form before comparing:
  observedVersions <- sapply(sapply(expected$Package, packageVersion), paste, collapse = ".")
  expectedVersions <- sapply(sapply(expected$Version, numeric_version), paste, collapse = ".")
  mismatchIdx <- which(observedVersions != expectedVersions)
  if (length(mismatchIdx) > 0) {
    lines <- sapply(mismatchIdx, function(idx) sprintf("- Package %s version %s should be %s",
                                                       expected$Package[idx],
                                                       observedVersions[idx],
                                                       expectedVersions[idx]))
    errorMessage <- paste(c("Mismatch between required and installed package versions. Did you forget to run renv::restore()?",
                            lines),
                          collapse = "\n")
    stop(errorMessage)
  }
}

Once this function has been added to the study package it is recommended to add a call to this function to your study’s main function like this.
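
As a rough illustration of where such a call could go, the sketch below shows a main function that verifies the dependencies before doing anything else. The function name execute() and its outputFolder argument are hypothetical, and verifyDependencies() is replaced by a stand-in here so the example runs on its own.

```r
# Stand-in for the verifyDependencies() function of the study package,
# used here only so the example is self-contained:
verifyDependencies <- function() {
  invisible(TRUE)
}

# Hypothetical main function of a study package:
execute <- function(outputFolder) {
  # Stop early if installed package versions do not match the lock file:
  verifyDependencies()

  if (!dir.exists(outputFolder)) {
    dir.create(outputFolder, recursive = TRUE)
  }
  # ... the rest of the study code would go here ...
  invisible(outputFolder)
}

exampleFolder <- file.path(tempdir(), "studyOutput")
execute(exampleFolder)
```

Placing the check first means a version mismatch stops the study before any output is written.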

Testing

Once you’re done creating a lock file, and writing the instructions for how people can install your study package including its dependencies, make sure to test the instructions. Since renv isolates the study project, you can even do this using the same machine used to create the package and lock file (just in a different folder).

Updating a package in the lock file

If you wish to update a dependency in the lock file (like your study package), there are two options, depending on whether the version number of the package has increased:

If the version of the package has increased

If the version number has increased (for example the version number of your study package increased because you changed it in the DESCRIPTION file), you can simply update the version number in the lock file. See details on the lock file earlier in the document. Then, to update the package library in your R project, you can call

renv::restore(packages = c("myPackage"))

Here, ‘myPackage’ is the name of the package you’re updating.

If the version of the package is the same

If the package has changed, but the version number hasn’t (for example because you just made a minor change to your study package) you should be aware that renv maintains a package cache. That means that, even if you were to delete the entire project library, renv would still install the old version of the package from the cache, rather than installing it from its source.

To completely purge a package from the renv cache, and install it again, make sure to restart R (so the package isn’t locked), and use:

# Remove the package from the library:
remove.packages("myPackage")

# Purge the package from the renv cache:
renv::purge("myPackage")

# Restore the new version of the package from its source:
renv::restore(packages = c("myPackage"))

Here, ‘myPackage’ is the name of the package you’re updating.