CRAN compliant extendr packages
Josiah Parry
Source: vignettes/articles/cran-compliance.Rmd
Rust-based packages must meet a number of fairly stringent requirements to be hosted on CRAN. In mid-2023, CRAN published "Using Rust in CRAN packages", which outlines its requirements for building and hosting Rust-based packages.
This article describes CRAN requirements as of the day of writing and illustrates how rextendr can be used to adhere to them.
SystemRequirements
Building Rust-backed packages from source requires the system dependencies cargo and rustc. CRAN has stipulated that its preferred way of tracking this is the following line in a package's DESCRIPTION file.
SystemRequirements: Cargo (Rust's package manager), rustc
Even though this is a free-form field, having consistency can help the whole ecosystem keep track of Rust-based R packages.
cargo and rustc availability
For an R package to be built from source, cargo and rustc need to be available on the machine compiling the package. R packages that use external dependencies are expected to have configure and configure.win files that check whether the dependencies are available before attempting to compile the package. If the checks fail, the build process is stopped early.
CRAN expects that if cargo is not on the PATH, the user’s home directory is checked at ~/.cargo/bin. The configuration files must perform these checks.
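The lookup order described above can be sketched as a small shell helper. This is a minimal illustration and not the script that rextendr generates. The function name find_cargo_dir is hypothetical.

```shell
#!/usr/bin/env sh
# Sketch only: find_cargo_dir is a hypothetical helper, not part of rextendr.

# Print the directory that contains cargo, or return non-zero if none is found.
find_cargo_dir() {
  if command -v cargo >/dev/null 2>&1; then
    # cargo is already on the PATH
    dirname "$(command -v cargo)"
    return 0
  fi
  if [ -x "$HOME/.cargo/bin/cargo" ]; then
    # fall back to the location in the user's home directory
    printf '%s\n' "$HOME/.cargo/bin"
    return 0
  fi
  return 1
}
```

A configure script could call the helper as CARGO_DIR=$(find_cargo_dir) || exit 1, which stops the build early when cargo cannot be found. The same kind of check would be needed for rustc.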
cargo build settings
CRAN also imposes restrictions on how cargo builds crates. By default, cargo uses multiple threads to speed up compilation, but CRAN policy allows a maximum of two logical CPUs during the build. This limit is set by passing the -j 2 option to cargo build.
Additionally, to minimize security risks and ensure package stability, CRAN requires that packages be built completely offline. This prevents external dependencies from being downloaded at compile time. Because of this requirement, vendored dependencies must be used.
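As a rough sketch, the two build restrictions can be combined in one wrapper around cargo build. The wrapper name cargo_build_cran is hypothetical, and any further arguments are passed through unchanged.

```shell
#!/usr/bin/env sh
# Sketch only: cargo_build_cran is a hypothetical wrapper name.

cargo_build_cran() {
  # -j 2 caps the number of parallel jobs at two.
  # --offline prevents cargo from accessing the network.
  cargo build -j 2 --offline "$@"
}
```

For example, cargo_build_cran --release --manifest-path=./rust/Cargo.toml would build a release profile. The manifest path here is an assumption about the package layout.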
Vendored dependencies
Vendoring dependencies is the act of including the dependencies themselves in a package's source code. With Rust, dependencies are normally fetched at compile time. To enable compilation in an offline environment, dependencies must be vendored, which is accomplished using the cargo vendor command.
cargo vendor creates a local directory, named vendor by default, which contains the source code for each of the recursive dependencies of the crate being built. For CRAN compatibility, the vendor directory must be compressed as an xz-compressed tar archive and included in the package source. At build time, the dependencies are extracted, compiled, and then discarded. This process is controlled by the Makevars and Makevars.win files.
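The vendoring round trip described above might look roughly like the following. The helper names are hypothetical, and the archive name vendor.tar.xz is the one this article uses later. Redirecting the output of cargo vendor into vendor-config.toml is an assumption about how that file could be produced. cargo vendor does print the configuration needed to use the vendored sources to standard output.

```shell
#!/usr/bin/env sh
# Sketch only: all helper names are hypothetical.

# Run by the package author. Needs cargo and network access.
vendor_deps() {
  # cargo vendor writes ./vendor and prints the source configuration
  # to stdout (assumption: that output is saved as vendor-config.toml).
  cargo vendor > vendor-config.toml
}

# Compress the vendor directory with xz (-J) for inclusion in the package.
compress_vendor() {
  tar -cJf vendor.tar.xz vendor
}

# Run at build time: unpack the vendored sources before compiling.
extract_vendor() {
  tar -xJf vendor.tar.xz
}
```

The functions are defined but not invoked, since vendor_deps requires a crate and a working cargo installation.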
Package compilation
All of this comes together at package compilation time, provided that all of the following requirements are met:
- cargo must be able to be called from a user’s home directory
- the user’s home directory must not be modified or written to
- the package must be compiled offline
- no more than two logical CPUs are used
- the versions of cargo and rustc are printed
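Two of these requirements can be illustrated with a short sketch: keeping cargo from writing to the home directory, and printing the toolchain versions. It assumes that pointing the CARGO_HOME environment variable at a scratch directory inside the build tree is an acceptable way to keep the home directory untouched. The helper names are hypothetical, and the generated Makevars may do this differently.

```shell
#!/usr/bin/env sh
# Sketch only: both helper names are hypothetical.

# Point CARGO_HOME at a scratch directory inside the build tree so that
# cargo does not write to the user's home directory while compiling.
setup_build_env() {
  build_dir=$1
  mkdir -p "$build_dir/.cargo"
  CARGO_HOME="$build_dir/.cargo"
  export CARGO_HOME
}

# Print the toolchain versions so that they appear in the build log.
print_toolchain_versions() {
  cargo --version
  rustc --version
}
```

A build recipe would call setup_build_env with its own build directory, then print_toolchain_versions, before extracting the vendored sources and running the offline build.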
Using CRAN defaults
rextendr provides default CRAN compliant scaffolding via the use_cran_defaults() function and appropriate vendoring with vendor_pkgs().
Making a package CRAN compliant
To create a CRAN compliant R package, begin by creating a new R package with usethis::create_package(). In the new R project, run rextendr::use_extendr() to create the minimal scaffolding necessary for a Rust-powered R package. Once you have done this, run rextendr::use_cran_defaults().
use_cran_defaults() will create the configure and configure.win files. Additionally, it will create new Makevars and Makevars.win files that print the versions of cargo and rustc and pass the arguments -j 2 --offline to cargo build.
Vendoring packages
After having configured your R package to use CRAN defaults, you will need to vendor your dependencies.
vendor_pkgs() runs cargo vendor on your behalf, compresses the vendor/ directory, and updates the vendor-config.toml file accordingly.
Whenever you add new dependencies or change the version or source of a crate, run vendor_pkgs() again. Doing so ensures that the compressed vendor.tar.xz contains the updates too. This is very important for CI and for publishing to CRAN.