[Rcpp-devel] R vectorisation vs. C++ vectorisation

Wed Nov 21 12:56:07 CET 2012

Le 19/11/12 17:47, Douglas Bates a écrit :
> On Mon, Nov 19, 2012 at 9:56 AM, Dirk Eddelbuettel <edd at debian.org
> <mailto:edd at debian.org>> wrote:
>
>
>     On 19 November 2012 at 09:31, Hadley Wickham wrote:
>     | Hi all,
>     |
>     | Inspired by "Rcpp is smoking fast for agent-based models in data
>     | frames" (http://www.babelgraph.org/wp/?p=358), I've been doing some
>
>     [ I liked that post, but we got flak afterwards as his example was
>     not well
>     chosen. The illustration of the language speed difference does of course
>     hold. ]
>
>     | exploration of vectorisation in R vs C++ at
>     | https://gist.github.com/4111256
>     |
>     | I have five versions of the basic vaccinate function:
>     |
>     | * vacc1: vectorisation in R with a for loop
>     | * vacc2: used vectorised R primitives
>     | * vacc3: vectorised with loop in C++
>     | * vacc4: vectorised with Rcpp sugar
>     | * vacc5: vectorised with Rcpp sugar, explicitly labelled as containing
>     | no missing values
>     |
>     | And the timings I get are as follows:
>     |
>     | Unit: microseconds
>     |                     expr    min     lq median     uq     max neval
>     |  vacc1(age, female, ily) 6816.8 7139.4 7285.7 7823.9 10055.5   100
>     |  vacc2(age, female, ily)  194.5  202.6  212.6  227.9   260.4   100
>     |  vacc3(age, female, ily)   21.8   22.4   23.4   24.9    35.5   100
>     |  vacc4(age, female, ily)   36.2   38.7   41.3   44.5    55.6   100
>     |  vacc5(age, female, ily)   29.3   31.3   34.0   36.4    52.1   100
>     |
>     | Unsurprisingly the R loop (vacc1) is very slow, and proper
>     | vectorisation speeds it up immensely.  Interestingly, however, the C++
>     | loop still does considerably better (about 10x faster) - I'm not sure
>     | exactly why this is the case, but I suspect it may be because it
>     | avoids the many intermediate vectors that R requires.  The sugar
>     | version is about half as fast, but this gets quite a bit faster with
>     | explicit no missing flags.
>     |
>     | I'd love any feedback on my code (https://gist.github.com
>     <https://gist.github.com/4111256>
>
>     /4111256 <https://gist.github.com/4111256>) -
>     | please let me know if I've missed anything obvious.
>
>     I don't have a problem with sugar being a little slower that
>     hand-rolling.
>     The code is so much simpler and shorter. And we're still way faster than
>     vectorised R.  I like that place.
>
>     Somewhat off-topic/on-topic: I am still puzzled by how the Julia
>     guys now
>     revert back from vectorised code to hand-written loops because llvm does
>     better on those.  Speed is good, but concise code with speed is
>     better in my
>     book.
>
>
> Sigh.  Speaking as one of the "Julia guys" I should point out two things
> (not that they will change Dirk's "cold, dead hands" attitude towards
> Julia :-)

That might however raise the interest of some other Rcpp author ^^

> 1. Comprehensions provide what I feel is a clean syntax for sugar-like
> operations in Julia
>
> 2. A problem with vectorization is the issue of multiple loops, hence
> the number of attempts at implementing delayed evaluation in compiled
> code (Eigen) and in add-on's to R.
>
> A translation of Hadley's vacc3 into Julia could be
>
> function vacc3a(age::Float64, female::Bool, ily::Float64){
>    p = 0.25 + 0.3 * 1 / (1 - exp(0.04 * age)) + 0.1 * ily
>    p *= female ? 1.25 : 0.75
>    min(max(0., p), 1.)
> }
>
> out = [vacc3a(age[i], female[i], ily[i]) for i in 1:length(age)]
>
> The comprehension collapses the
>   1. Determine the length of the output vector
>   2. Allocate the result
>   3. Loop over indices populating the result
>   4. Return the result
>
> to a single syntactic element that, in my opinion, is quite readable.

Yes. I'm looking into mapply so that we could do e.g.

NumericVector p = mapply( age, female, ily, fun )

where fun is a function object with the correct signature, dealing with 
individual elements. Similar idea.

>     Hence I would prefer to invoke the 80/20 rule as I think we have better
>     targets to chase than to narrow that gap. But that's just my $0.02...
>
>     If you can't sleep til both version have 20-some microsend medians
>     then by
>     all means go crazy ;-)
>
>     Dirk

-- 
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30

R Graph Gallery: http://gallery.r-enthusiasts.com
`- http://bit.ly/SweN1Z : SuperStorm Sandy

blog:            http://romainfrancois.blog.free.fr
|- http://bit.ly/RE6sYH : OOP with Rcpp modules
`- http://bit.ly/Thw7IK : Rcpp modules more flexible