This is a refresher and/or crash course on the essentials of building packages for the R language and programming environment, complete with the documentation necessary for distributing these packages on GitHub and CRAN.
Rather than aiming to build a perfect R package, this tutorial aims to provide only the minimal details necessary for building a functional package. The main goal is to create a repository for custom functions, complete with necessary documentation to make the package useful to others, and to publish the package on CRAN. This tutorial should result in a package that is a collection of custom functions that can be relied on to save you time and improve the reproducibility of your data analytic endeavors.
Just like many other programming languages for scientific computing, R makes use of a modular system for distributing code, which makes the creation and maintenance of specialized R code both organized and manageable.
In R, a variety of especially useful tools have been developed to ease the transition from writing custom functions to building packages consisting of customized R code (and distributing the resulting packages).
Custom R code allows both the fundamental capabilities and the idiosyncratic behavior of R to be modified, and a package makes this code more easily accessible both to you and to others who may find it useful.
Install the devtools and roxygen2 packages from CRAN with install.packages(c("devtools", "roxygen2")).
roxygen2 is necessary for generating proper documentation for
the package manual.
devtools provides numerous utilities that make building packages significantly easier, including devtools::create(), devtools::document(), devtools::test(), devtools::build(), and devtools::check(), to name but a few.
Navigate to the parent directory of the package you would like to create, for example with cd DIR from the command line (or the equivalent from within R).
Build the skeleton for the new package, generating a new directory in the process, using the R command devtools::create("MYPKG"). This creates (1) a subdirectory R/ for R code, (2) a file NAMESPACE, which will (later) be populated with function and requirement names, (3) a file DESCRIPTION for required package metadata, and (4) an RStudio project file.
Next, navigate into the package directory and set it up as a Git repository with git init (note: this is not strictly necessary but is good practice for any project).
Consider using Git (including git push) along with GitHub, as the latter provides public access to the package revision history via GitHub's site.
Following one of several style guides, set up custom functions in several .R files in the R/ subdirectory of the package; best practice is to organize functions thematically into distinct files.
In particular, I recommend following the stylistic advice in Hadley Wickham’s comprehensive book R packages (this book also contains a wealth of other information and tips for writing R packages).
After setting up the desired functions in .R files, add the minimal required documentation.
Note that documentation must be placed directly before each function definition in the .R files, using the comment format required by the roxygen2 package; this saves time in the long run by allowing manual pages to be generated automatically.
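As a minimal sketch of what such documentation can look like, the block below shows a single function preceded by roxygen2 comments. The function rescale01 and its arguments are hypothetical, chosen only for illustration; the tags (@param, @return, @examples, @export) are standard roxygen2 tags, though a given package may need more or fewer of them.

```r
#' Rescale a numeric vector to the unit interval
#'
#' Linearly maps the values of \code{x} so that the minimum becomes 0 and
#' the maximum becomes 1.
#'
#' @param x A numeric vector with at least two distinct finite values.
#' @param na.rm Logical; should missing values be ignored when computing
#'   the range?
#'
#' @return A numeric vector of the same length as \code{x}.
#'
#' @examples
#' rescale01(c(2, 4, 6))
#'
#' @export
rescale01 <- function(x, na.rm = TRUE) {
  # Hypothetical example function, not part of any existing package.
  stopifnot(is.numeric(x))
  rng <- range(x, na.rm = na.rm)
  (x - rng[1]) / (rng[2] - rng[1])
}
```

Each line beginning with #' is read by roxygen2 rather than by R itself, so the function behaves identically with or without the comments.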
Formal automated testing of code is an important step in ensuring that work is reproducible - specifically, unit testing ensures that code contains fewer bugs, is more robust, and is structured more clearly.
To begin writing unit tests for a package, run devtools::use_testthat() in the package directory (in recent versions of devtools, this helper has moved to the usethis package as usethis::use_testthat()). This creates a subdirectory tests/testthat to store the individual unit tests for each function, as well as a file tests/testthat.R that runs all tests during R CMD check.
Write individual test files for each function in the package, with multiple test_that statements checking various use cases. For advice on organizing and writing tests, see this helpful chapter by Hadley Wickham.
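A test file along these lines might look like the following sketch, which assumes the testthat package is installed. The function rescale01 is hypothetical and is defined inline only so that the example is self-contained; in a real package it would live in R/ and be loaded by the test runner.

```r
library(testthat)

# Hypothetical function under test; in a real package this definition
# would sit in a file under R/ rather than in the test file itself.
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

test_that("rescale01 maps the minimum to 0 and the maximum to 1", {
  out <- rescale01(c(2, 4, 6))
  expect_equal(min(out), 0)
  expect_equal(max(out), 1)
})

test_that("rescale01 preserves length and propagates missing values", {
  out <- rescale01(c(1, NA, 3))
  expect_length(out, 3)
  expect_true(is.na(out[2]))
})
```

Grouping related expectations under one descriptive test_that label makes a failure report easier to trace back to the behavior that broke.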
After writing appropriate tests for each function in the tests/testthat subdirectory, ensure that all tests are working by using devtools::test().
Repeat the above step as necessary to resolve any problems brought to light in the testing process. Once devtools::test() runs without reporting any failures, move on to the final steps of building and releasing the package.
Once all desired custom functions, and proper comments for documentation,
have been set up in the
.R files in the
R/ subdirectory, use
devtools::document() to generate package documentation and manual.
Running devtools::document() will generate (1) a subdirectory man/ for manual pages (.Rd files), and (2) a number of .Rd files (one for each documented function), all of which may be found in the man/ subdirectory.
After the documentation has been properly generated, the package can be built and tested: in R, use devtools::build() while in the main package directory; this will produce a compressed (.tar.gz) version of the package in the parent directory (this can also be done from the command line with R CMD build MYPKG).
To ensure that the package is working appropriately, either (1) run R CMD check MYPKG.tar.gz on the built version of the package from the command line; or (2) while in the package directory, from R, run devtools::check().
Ensure that the package is working as intended by resolving all issues marked as WARNING or ERROR in the results produced by running the check.
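The document, test, check, and build steps described above can be sketched as a single helper. This is only an illustration under the assumption that devtools is installed; the wrapper name run_package_cycle and its pkg_dir argument are hypothetical, while the four devtools calls are the ones named in the text.

```r
# Hypothetical convenience wrapper around the steps described above.
# Assumes the devtools package is installed and that pkg_dir points to
# the root directory of a package (the one containing DESCRIPTION).
run_package_cycle <- function(pkg_dir = ".") {
  devtools::document(pkg_dir)  # regenerate the .Rd manual pages
  devtools::test(pkg_dir)      # run the tests under tests/testthat
  devtools::check(pkg_dir)     # build the package and check it
  devtools::build(pkg_dir)     # write the .tar.gz bundle
}
```

Calling run_package_cycle() from the package directory runs each step in turn; running the steps one at a time is often preferable while issues are still being fixed.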
Assuming that the repository has been pushed to GitHub, the package will be available from there, and may be installed using devtools::install_github("USER/REPO") within R.
Submit the package to CRAN (this can also be done with devtools::submit_cran() in R); after it is accepted, the package will be available for download with install.packages("MYPKG").
devtools::create("MYPKG") - generates a package skeleton as described above.
devtools::document() - generates package documentation using the roxygen2-style comments preceding each function in the various .R files.
devtools::use_build_ignore("FILES") - adds named files to .Rbuildignore with proper syntax. This is necessary for files that CRAN would not accept in a package bundle.
devtools::use_testthat() - adds a subdirectory tests/testthat for writing individual tests and a file tests/testthat.R to run all tests when R CMD check is used.
devtools::use_travis() - adds a
.travis.yml config file to the repository
to be used with Travis CI.
devtools::test() - runs all of the available tests in the subdirectory tests/testthat to ensure that any functions with tests are working as intended.
devtools::check() - builds the package and performs necessary checks to
ensure that everything is running smoothly (or points out errors). This is a
bit more thorough than
R CMD check.
devtools::build() - generates the package manual and compiles other
necessary aspects, ultimately resulting in a zipped (
.tar.gz) package file.
devtools::build_win() - builds the package and submits it to CRAN's win-builder service for checking, with a status report generated roughly 20 minutes later. This conveniently checks against r-devel. (Recent versions of devtools provide this as devtools::check_win_devel().)
devtools::release() - builds the package, performs
R CMD check, asks
various questions, then uploads the bundle to CRAN. Preferable to
devtools::submit_cran() since this is more thorough.
devtools::submit_cran() - builds and submits the package to CRAN, avoiding
the (annoying) interface.
R CMD build MYPKG - (from the command line) builds the package when run in
the parent directory, generating a zipped (
.tar.gz) package file.
R CMD check MYPKG.tar.gz - (from the command line) runs necessary checks on
a built package, pointing out any warnings and errors that need correction.
R CMD check --as-cran MYPKG - (from the command line) runs checks similar to
the above but with additional requirements specific to CRAN that are necessary
for successful submission.