[Rcpp-devel] R vectorisation vs. C++ vectorisation

Mon Nov 19 16:56:36 CET 2012

On 19 November 2012 at 09:31, Hadley Wickham wrote:
| Hi all,
| 
| Inspired by "Rcpp is smoking fast for agent-based models in data
| frames" (http://www.babelgraph.org/wp/?p=358), I've been doing some

[ I liked that post, but we got flak afterwards as his example was not well
chosen. The illustration of the language speed difference does of course
hold. ]

| exploration of vectorisation in R vs C++ at
| https://gist.github.com/4111256
| 
| I have five versions of the basic vaccinate function:
| 
| * vacc1: vectorisation in R with a for loop
| * vacc2: uses vectorised R primitives
| * vacc3: vectorised with loop in C++
| * vacc4: vectorised with Rcpp sugar
| * vacc5: vectorised with Rcpp sugar, explicitly labelled as containing
| no missing values
| 
| And the timings I get are as follows:
| 
| Unit: microseconds
|                     expr    min     lq median     uq     max neval
|  vacc1(age, female, ily) 6816.8 7139.4 7285.7 7823.9 10055.5   100
|  vacc2(age, female, ily)  194.5  202.6  212.6  227.9   260.4   100
|  vacc3(age, female, ily)   21.8   22.4   23.4   24.9    35.5   100
|  vacc4(age, female, ily)   36.2   38.7   41.3   44.5    55.6   100
|  vacc5(age, female, ily)   29.3   31.3   34.0   36.4    52.1   100
| 
| Unsurprisingly the R loop (vacc1) is very slow, and proper
| vectorisation speeds it up immensely.  Interestingly, however, the C++
| loop still does considerably better (about 10x faster) - I'm not sure
| exactly why this is the case, but I suspect it may be because it
| avoids the many intermediate vectors that R requires.  The sugar
| version is about half as fast, but this gets quite a bit faster with
| explicit no missing flags.
| 
| I'd love any feedback on my code (https://gist.github.com/4111256) -
| please let me know if I've missed anything obvious.

I don't have a problem with sugar being a little slower than hand-rolling.
The code is so much simpler and shorter. And we're still way faster than
vectorised R.  I like that place.  
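
A rough sketch of the trade-off, in plain standard C++ rather than Rcpp: the
per-element formula below is a made-up stand-in (it is not the code from the
gist), and the names vacc_one, vacc_loop and vacc_staged are hypothetical.
vacc_loop does everything in one pass, while vacc_staged materialises one full
temporary vector per step, roughly the pattern Hadley suspects vectorised R
follows. Both return the same values; only the memory traffic differs.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-element formula (an assumption, not the gist's code):
// a base probability, scaled by sex, then clamped to [0, 1].
inline double vacc_one(double age, bool female, double ily) {
    double p = 0.25 + 0.01 * age + 0.1 * ily;
    p *= female ? 1.25 : 0.75;
    return std::min(1.0, std::max(0.0, p));
}

// Hand-written loop: one pass over the inputs, one output allocation.
std::vector<double> vacc_loop(const std::vector<double>& age,
                              const std::vector<bool>& female,
                              const std::vector<double>& ily) {
    const std::size_t n = age.size();
    std::vector<double> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = vacc_one(age[i], female[i], ily[i]);
    }
    return out;
}

// Staged version: each step writes a full intermediate vector before the
// next step starts, so there are three passes and three allocations.
std::vector<double> vacc_staged(const std::vector<double>& age,
                                const std::vector<bool>& female,
                                const std::vector<double>& ily) {
    const std::size_t n = age.size();
    std::vector<double> base(n), scaled(n), clamped(n);
    for (std::size_t i = 0; i < n; ++i) {
        base[i] = 0.25 + 0.01 * age[i] + 0.1 * ily[i];
    }
    for (std::size_t i = 0; i < n; ++i) {
        scaled[i] = base[i] * (female[i] ? 1.25 : 0.75);
    }
    for (std::size_t i = 0; i < n; ++i) {
        clamped[i] = std::min(1.0, std::max(0.0, scaled[i]));
    }
    return clamped;
}
```

Whether the extra temporaries explain the gap in the timings above is a guess;
one would have to benchmark both variants on the real inputs to find out.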

Somewhat off-topic/on-topic: I am still puzzled by how the Julia guys now
revert from vectorised code to hand-written loops because LLVM does
better on those.  Speed is good, but concise code with speed is better in my
book. 

Hence I would prefer to invoke the 80/20 rule as I think we have better
targets to chase than to narrow that gap. But that's just my $0.02...  

If you can't sleep till both versions have 20-some microsecond medians then by
all means go crazy ;-)

Dirk

-- 
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com