R is a great, highly flexible language for statistical computing, but it can suffer from performance issues. As I've steadily increased my use of R, I quickly became aware that I would one day have to learn to integrate R with a programming language with better performance, the main choice here being C++. To integrate R with C++, the `Rcpp` framework (and R package) was created, allowing parts of the R code of a given package or project to be rewritten in C++ and easily integrated with R. Using `Rcpp` comes with great advantages in terms of R code performance; however, it obviously requires that one learn C++. I was about to devote a great deal of time to doing so when, fortuitously, I came across the rather new `renjin` project. Renjin is a new (in-development) interpreter for GNU R that relies on the Java Virtual Machine (JVM) to enhance R's performance. The idea seems to be that it can eventually serve as a drop-in replacement for GNU R. It seems that the `renjin` R package can be used to provide performance gains via interfacing with the JVM, just by wrapping standard R code.

## Minimal example

For now, I just thought I would try the example from the `renjin` R package documentation; more involved examples might be added to this post later or come in separate blog posts of their own. Here we go:

Let's make sure we have the Renjin R package installed (the URL below pins version 0.8.2404):

```
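if (!require(renjin)) {
  # repos = NULL and type = "source" mark this as a source tarball, not a package name
  install.packages("https://nexus.bedatadriven.com/content/groups/public/org/renjin/renjin-gnur-package/0.8.2404/renjin-gnur-package-0.8.2404.tar.gz",
                   repos = NULL, type = "source")
}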
```

`## Loading required package: renjin`

`library(renjin)`

Let’s define a function to simply add by iteration:

```
bigsum <- function(n) {
  sum <- 0
  for (i in seq(from = 1, to = n)) {
    sum <- sum + i
  }
  sum
}
```
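Before timing anything, it is worth confirming that the loop computes what we think it does. The sum 1 + 2 + ... + n has the closed form n(n + 1)/2, so a quick check might look like the sketch below. The helper name `closed_form` and the test size `n_check` are my own choices for illustration, not part of the `renjin` documentation example.

```r
# Same iterative sum as defined in the post.
bigsum <- function(n) {
  sum <- 0
  for (i in seq(from = 1, to = n)) {
    sum <- sum + i
  }
  sum
}

# Closed form for 1 + 2 + ... + n (hypothetical helper, for checking only).
closed_form <- function(n) n * (n + 1) / 2

# Compare the loop against the closed form on a moderately sized input.
n_check <- 1e5
stopifnot(isTRUE(all.equal(bigsum(n_check), closed_form(n_check))))
```

If the check passes silently, the loop and the formula agree, and any timing differences later on are differences in speed rather than in results.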

We can improve the speed of this function by pre-compiling it to bytecode using R’s native bytecode compiler. We’d expect this to save us some time relative to the naive implementation.

`bigsumc <- compiler::cmpfun(bigsum) # GNU R's byte code compiler`
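One caveat, as far as I understand it: since R 3.4.0 the just-in-time bytecode compiler is switched on by default, so a "naive" function may get compiled behind the scenes after its first call or two. A rough sketch of how one might keep the comparison honest is to turn the JIT off with `compiler::enableJIT(0)` before defining and timing the functions, then restore the previous level afterwards. The input size `1e6` here is an arbitrary choice to keep the sketch quick.

```r
library(compiler)

# Switch the JIT off; enableJIT() returns the previous level so we can restore it.
old_jit <- enableJIT(0)

# Define the function after disabling the JIT so it stays uncompiled.
bigsum <- function(n) {
  sum <- 0
  for (i in seq(from = 1, to = n)) {
    sum <- sum + i
  }
  sum
}

# Explicitly bytecode-compiled copy, as in the post.
bigsumc <- cmpfun(bigsum)

# Both versions must return the same value.
stopifnot(identical(bigsum(1e4), bigsumc(1e4)))

# Time each version once; results vary by machine and R version.
print(system.time(bigsum(1e6)))
print(system.time(bigsumc(1e6)))

# Restore whatever JIT level was active before.
enableJIT(old_jit)
```

With the JIT left on, the two timings can end up nearly identical, which is worth keeping in mind when reading any `cmpfun` comparison.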

Alright, now we’re ready to compare the performances of the naive and bytecode-compiled implementations:

```
time_norm <- system.time(bigsum(1e7))
time_comp <- system.time(bigsumc(1e7))
```

Notice that directly using R's native bytecode compiler improves the performance of our `bigsum` function a little: considering the time the system spends on the computation (the `system` column in the table below), we go from 0.017 to 0.008 seconds, a saving of about 0.01 seconds or (roughly) a factor of 2. Maybe `renjin` can help us out even more?

`time_renjin <- system.time(renjin(bigsum(1e7)))`

```
table <- as.data.frame(rbind(as.numeric(time_norm), as.numeric(time_comp),
                             as.numeric(time_renjin)))[, c(1, 2, 3)]
colnames(table) <- c("user", "system", "total")
rownames(table) <- c("naive", "cmpfun", "renjin")
print(table)
```

```
##         user system total
## naive  0.486  0.017 0.507
## cmpfun 0.463  0.008 0.473
## renjin 0.512  0.028 0.357
```

Wow, that is a nice result. Using `renjin`, even just as a wrapper, cuts the elapsed time (the column labeled `total` above, which holds the third, `elapsed`, entry of `system.time()`) from 0.507 to 0.357 seconds, a factor of about 1.4 relative to the naive implementation, and by quite a bit still (**a factor of about 1.3**) when compared to the bytecode-compiled version of our function. The `user` and `system` times are actually slightly higher under `renjin`, so the gain shows up in wall-clock time rather than in CPU time. This was just a simple example, timed with a single run, but we were able to save computational time just by naively calling `renjin`, and it took just a few extra characters to call it as a wrapper.
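A single `system.time()` call is fairly noisy, so the comparison above could be firmed up by repeating each measurement and summarising the elapsed times. Here is a minimal sketch in base R; the helper `time_median`, the number of repetitions, and the smaller input size `1e6` are all my own assumptions for illustration.

```r
# Same iterative sum as in the post, plus its bytecode-compiled copy.
bigsum <- function(n) {
  sum <- 0
  for (i in seq(from = 1, to = n)) {
    sum <- sum + i
  }
  sum
}
bigsumc <- compiler::cmpfun(bigsum)

# Hypothetical helper: run a zero-argument function `reps` times and
# return the median elapsed (wall-clock) time in seconds.
time_median <- function(fun, reps = 5) {
  elapsed <- vapply(
    seq_len(reps),
    function(i) system.time(fun())[["elapsed"]],  # pick the timing by name
    numeric(1)
  )
  median(elapsed)
}

results <- c(
  naive  = time_median(function() bigsum(1e6)),
  cmpfun = time_median(function() bigsumc(1e6))
)
print(results)

# With the renjin package loaded, a third entry could be timed the same way:
# renjin = time_median(function() renjin(bigsum(1e6)))
```

Extracting `"elapsed"` by name rather than by column position also makes it explicit which of the `system.time()` entries is being compared.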

Although Renjin is still in its infancy, I can't help but be excited for the future of R, and statistical computing in general, with how well it's already performing. We're going to be able to (try to) do great things with these new tools 👍
