Reproducible Environments

Megumi Oshima & Nicholas Ducharme-Barth

January 2025

What is a reproducible environment?

  • A computing setup where all the necessary components are explicitly defined
    • OS, package versions, software, etc.
  • Anyone can replicate that exact coding environment and the results will be the same across different machines

Signs you want to use a reproducible environment

  • Code that used to run no longer runs even though you haven’t changed the code itself
  • Things break after you upgrade to a new version of a package or software
  • You want to share your code with a collaborator and ensure they get the same results you get
  • You want to run your code in some other environment besides your local computer

How can we do this?

renv is an R package to help create reproducible environments

renv

  • Isolated - each project has its own library
  • Portable - easy to install on different computers (even across operating systems)
  • Reproducible - ensures you are using the exact package versions

renv Workflow

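[Workflow diagram: renv::init() gives a new project its own project library and lockfile; renv::install() adds packages from CRAN/GitHub to the project library; renv::snapshot() records the project library in the lockfile; renv::restore() rebuilds the project library from the lockfile.]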

Initialize project

  • When you first start a project, you will want to initialize an renv project

renv::init()

  • Creates 3 items:

renv/library/

renv.lock

.Rprofile
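
A minimal sketch of this step, assuming renv is already installed (for example via install.packages("renv")) and that the working directory is the project root. The file.exists() and .libPaths() calls are only illustrative checks and are not part of the renv workflow itself:

```r
# Run from the project root; assumes renv is already installed.
# In an interactive session renv may restart R after initialization
# unless restart = FALSE is passed.
renv::init(restart = FALSE)

# Illustrative check that the three items were created.
file.exists(c("renv/library", "renv.lock", ".Rprofile"))

# The project library should now come first on the library search path.
.libPaths()
```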

Installing new packages

renv::install() - installs packages into the project library; you can specify the source (e.g. CRAN or GitHub) and the version you want

As you install new packages, you will need to update the lockfile
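
For instance, the calls below sketch a few ways to point renv::install() at a particular version or source. The package (digest) and the GitHub repository (eddelbuettel/digest) are examples chosen only for illustration:

```r
# Latest version from the configured repositories (typically CRAN).
renv::install("digest")

# A specific version, using the package@version form.
renv::install("digest@0.6.18")

# The development version from GitHub, using the user/repo form.
renv::install("eddelbuettel/digest")

# Update the lockfile. By default only packages that the project's
# code actually uses are recorded.
renv::snapshot()
```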

Using the lockfile for collaboration

renv::snapshot() - updates the lockfile with metadata about all packages currently used in the project

renv::restore() - reproduces the environment specified in the lockfile
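
The collaboration loop might look like the sketch below. Committing renv.lock, .Rprofile, and renv/activate.R to version control is the usual convention and is assumed here; renv::status() is an optional extra check that is not part of the two steps above:

```r
# --- Author: after installing or updating packages ---
renv::status()    # report differences between the library and the lockfile
renv::snapshot()  # write the current package versions to renv.lock
# Then commit renv.lock, .Rprofile, and renv/activate.R.

# --- Collaborator: after cloning or pulling the project ---
renv::restore()   # install the exact versions recorded in renv.lock
```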

Another use for renv

  • Can use a lockfile for GitHub Actions (or other continuous integration systems)
  • Use the r-lib setup-renv action in your workflow

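jobs:
  build-deploy:
    runs-on: ubuntu-latest
    steps:
    # Check out the repository so renv.lock is available
    - uses: actions/checkout@v4
    # Install R before restoring packages
    - uses: r-lib/actions/setup-r@v2
    # Restore the packages recorded in renv.lock
    - name: Install packages using renv
      uses: r-lib/actions/setup-renv@v2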

Questions

Resources