#[extendr(use_try_from = true)]
fn ultimate_answer() -> i32 {
return 42_i32;
}
Conversion to and from R data
One of the key goals with extendr, is to provide a framework that allows you to write Rust functions, that interact with R, without having to know the intricacies within R internals, or even R’s C-facilities. However, this is unavoidable if one wishes to understand why the extendr-api is the way it is.
Thus, for introducing extendr, we shall mention facts about R internals, but these are not necessary to keep in mind going forward.
A fundamental data-type in R is the 32-bit integer, int
in C, and i32
in Rust. Passing that type around is essential, and straight forward:
And now this function is available within your R-session, as the output is 42.
Also, another fundamental data-type in R is numeric
/ f64
, which we can also pass back and forth uninhibitated, e.g.
#[extendr(use_try_from = true)]
fn return_tau() -> f64 {
std::f64::consts::TAU
}
where \(\tau := 2\pi =\) \(6.2831853\).
However, passing data from R to Rust must be done with a bit of care: In R, representing a true integer in literal form requires using L
after the literal.
#[extendr(use_try_from = true)]
fn bit_left_shift_once(number: i32) -> i32 {
<< 1
number }
This function supposedly is a clever way to multiply by two, however passing bit_left_shift_once(21.1)
results in
Error in bit_left_shift_once(21.1): Expected an integer or a float representing a whole number, got 21.1
where bit_left_shift_once(21)
is 42, as expected.
R also has the concept of missing numbers, NA
encoded within its data-model. However i32
/f64
do not natively have a representation for NA
e.g.
bit_left_shift_once(NA_integer_)
Error in bit_left_shift_once(NA_integer_): Must not be NA.
bit_left_shift_once(NA_real_)
Error in bit_left_shift_once(NA_real_): Must not be NA.
bit_left_shift_once(NA)
Error in bit_left_shift_once(NA): Must not be NA.
Instead, we have to rely on extendr’s scalar variants of R types, Rint
/ Rfloat
to encompass the notion of NA
in our functions:
#[extendr(use_try_from = true)]
fn double_me(value: Rint) -> Rint {
if value.is_na() {
Rint::na()
} else {
.inner() << 1).into()
(value}
}
which means, we can now handle missing values in the arguments
double_me(NA_integer_)
[1] NA
double_me(NA_real_)
[1] NA
double_me(NA)
[1] NA
One may notice here that NA_real_
was accepted even for an Rint
. The reason for this, is when you specify a type without &
/&mut
, the value is coerced in a similar way, as R coerces values. In order to have strict type-checking during run-time, use &
/ &mut
, as
#[extendr(use_try_from = true)]
fn wrong_input(value: &Rint) -> Rint {
.clone()
value}
wrong_input(NA_integer_)
Error in wrong_input(NA_integer_): Must not be NA.
wrong_input(NA_real_)
Error in wrong_input(NA_real_): expected 13, got 14
wrong_input(21.0)
Error in wrong_input(21): expected 13, got 14
wrong_input(21L)
[1] 21
Here, only the last literal is a true Rint
.
Vectors
Most data in R are vectors. Scalar values are in fact 1-sized vectors, and even lists are defined by a vector-type. A vector type in Rust is Vec
. A Vec
has a type-information, length, and capacity. This means, that if necessary, we may expand any given Vec
-data to contain more values, and only when capacity is exceeded, will there be a reallocation.
Naively, we may define a function like so
#[extendr(use_try_from = true)]
fn repeat_us(mut values: Vec<i32>) -> Vec<i32> {
assert_eq!(values.capacity(), values.len(), "must have zero capacity left");
0] = 100;
values[.push(55);
values
values}
<- c(1L, 2L, 33L)
x repeat_us(x)
[1] 100 2 33 55
Even if the argument is mut Vec<_>
, what happens is that the R vector gets converted to a Rust owned type, and it is that type that we can modify, and augment, with syncing to the original data.
Of course, a slice e.g. &[i32]
/ &mut [i32]
could be used instead, and this allows us to modify the original data, i.e.
#[extendr(use_try_from = true)]
fn zero_middle_element(values: &mut [i32]) {
let len = values.len();
let middle = len / 2;
= 0;
values[middle] }
<- c(100L, 200L, 300L)
x zero_middle_element(x)
x
[1] 100 0 300
This is great! If we wanted to insert an NA
in the middle, we would have had to operate on &mut [Rint]
instead.
A slice is a representation of a sequence of elements that are part of a larger collection. Since they represent only part of a collection (vector, in this case), we cannot add new elements to this. To do so, we have to rely on extendr provided types, that provide a Vec
-like API to R’s vector-types. These are the Integers
, Logicals
, Doubles
, and Strings
types.