[Rcpp-devel] R vectorisation vs. C++ vectorisation

Hadley Wickham h.wickham at gmail.com
Mon Nov 19 16:31:56 CET 2012


Hi all,

Inspired by "Rcpp is smoking fast for agent-based models in data
frames" (http://www.babelgraph.org/wp/?p=358), I've been doing some
exploration of vectorisation in R vs C++ at
https://gist.github.com/4111256

I have five versions of the basic vaccinate function:

* vacc1: vectorisation in R with a for loop
* vacc2: uses vectorised R primitives
* vacc3: vectorised with loop in C++
* vacc4: vectorised with Rcpp sugar
* vacc5: vectorised with Rcpp sugar, explicitly labelled as containing
no missing values

And the timings I get are as follows:

Unit: microseconds
                    expr    min     lq median     uq     max neval
 vacc1(age, female, ily) 6816.8 7139.4 7285.7 7823.9 10055.5   100
 vacc2(age, female, ily)  194.5  202.6  212.6  227.9   260.4   100
 vacc3(age, female, ily)   21.8   22.4   23.4   24.9    35.5   100
 vacc4(age, female, ily)   36.2   38.7   41.3   44.5    55.6   100
 vacc5(age, female, ily)   29.3   31.3   34.0   36.4    52.1   100

Unsurprisingly, the R loop (vacc1) is very slow, and proper
vectorisation (vacc2) speeds it up immensely, by about 34x at the
median.  Interestingly, however, the C++ loop (vacc3) still does
considerably better (roughly 9x faster than vacc2) - I'm not sure
exactly why this is the case, but I suspect it may be because it
avoids the many intermediate vectors that R requires.  The sugar
version (vacc4) takes a bit under twice as long as the C++ loop, but
this gets quite a bit faster with explicit no-missing flags (vacc5).
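
To make the intermediate-vector guess concrete, here is a minimal
sketch in plain standard C++ (deliberately not Rcpp, so it compiles on
its own).  Everything in it is an assumption for illustration: the
formula in vacc_one is a stand-in rather than necessarily the one in
the gist, and vacc_fused / vacc_staged are hypothetical names.
vacc_fused does all the work in one pass, like a hand-written C++
loop; vacc_staged allocates a fresh vector per step, which is roughly
what a chain of vectorised R operations has to do.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-agent calculation (a stand-in; see the gist for the
// real one). Returns a probability clamped to [0, 1].
inline double vacc_one(double age, bool female, bool ily) {
    double p = 0.25 + 0.3 / (1.0 - std::exp(0.04 * age))
                    + 0.1 * (ily ? 1.0 : 0.0);
    p *= female ? 1.25 : 0.75;
    return std::min(1.0, std::max(0.0, p));
}

// One pass over the inputs, one output allocation, no temporaries.
std::vector<double> vacc_fused(const std::vector<double>& age,
                               const std::vector<bool>& female,
                               const std::vector<bool>& ily) {
    assert(age.size() == female.size() && age.size() == ily.size());
    std::vector<double> out(age.size());
    for (std::size_t i = 0; i < age.size(); ++i) {
        out[i] = vacc_one(age[i], female[i], ily[i]);
    }
    return out;
}

// Same arithmetic, but each step writes a full-length intermediate
// vector, mimicking how each vectorised operation yields a new vector.
std::vector<double> vacc_staged(const std::vector<double>& age,
                                const std::vector<bool>& female,
                                const std::vector<bool>& ily) {
    assert(age.size() == female.size() && age.size() == ily.size());
    const std::size_t n = age.size();
    std::vector<double> base(n), scaled(n), floored(n), out(n);
    for (std::size_t i = 0; i < n; ++i) {
        base[i] = 0.25 + 0.3 / (1.0 - std::exp(0.04 * age[i]))
                       + 0.1 * (ily[i] ? 1.0 : 0.0);
    }
    for (std::size_t i = 0; i < n; ++i) {
        scaled[i] = base[i] * (female[i] ? 1.25 : 0.75);
    }
    for (std::size_t i = 0; i < n; ++i) {
        floored[i] = std::max(0.0, scaled[i]);
    }
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = std::min(1.0, floored[i]);
    }
    return out;
}
```

Both functions should return the same values for the same inputs; the
staged one just touches memory several times over.  This does not
reproduce the timings above, and the real R version has extra costs
(and optimisations) that a sketch like this ignores.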

I'd love any feedback on my code (https://gist.github.com/4111256) -
please let me know if I've missed anything obvious.

Hadley

-- 
RStudio / Rice University
http://had.co.nz/

