Enhancing Reproducibility and Collaboration via Management of R Package Cohorts

Gabriel Becker, Cory Barr, Robert Gentleman, Michael Lawrence

Main Article Content

Abstract

Science depends on collaboration, result reproduction, and the development of supporting software tools. Each of these requires careful management of software versions. We present a unified model for installing, managing, and publishing software contexts in R. It introduces the package manifest as a central data structure for representing versionspecific, decentralized package cohorts. The manifest points to package sources on arbitrary hosts and in various forms, including tarballs and directories under version control. We provide a high-level interface for creating and switching between side-by-side package libraries derived from manifests. Finally, we extend package installation to support the retrieval of exact package versions as indicated by manifests, and to maintain provenance for installed packages. The provenance information enables the user to publish libraries or sessions as manifests, hence completing the loop between publication and deployment. We have implemented this model across three software packages, switchr, switchrGist and GRANBase, and have released the source code under the Artistic 2.0 license.

Article Details

Article Sidebar