[Rcpp-devel] R vectorisation vs. C++ vectorisation

Thu Nov 22 00:51:03 CET 2012

>> By "sugar versions" I meant vacc4() vs. vacc3()
>> (https://gist.github.com/4111256) not pmin() and friends. The vacc4()
>> code looks like:
>>
>>    NumericVector p(age.size());
>>    p = 0.25 + 0.3 * 1 / (1 - exp(0.04 * age)) + 0.1 * ily;
>>    p = p * ifelse(female, 1.25, 0.75);
>>    p = pmax(0,p);
>>    p = pmin(1,p);
>>
>> Each operation copies the whole NumericVector each time, each of which
>> needs a memory allocation.
> 
> Nope. The NumericVector::operator=( sugar expression ) is used and
> memory for p is allocated just once, when p is constructed with this line:
> 
>  NumericVector p(age.size());
> 
> None of the "p = ..." lines allocate memory for p.

Thanks for the correction Romain. That is very interesting, almost
amazing... [time passes] ... Oh, I get it!

>> vacc3a() does the same pipeline of operations on a single double, which
>> is therefore likely to be a single CPU register, and the whole of
>> vacc3a() will be inlined in vacc3().

Like a man clinging to the wreckage, I stand by my theory that vacc3()
will always be quicker than vacc4(). In vacc4(), each operation on p
requires a full pass through memory: a fetch, then the calculation, then
a write back. CPU cache and pre-fetching help, but you can't completely
eliminate this overhead (AFAIK).
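
To make the argument concrete, here is a minimal sketch in plain standard
C++ (std::vector, not Rcpp). The function names vacc_scalar, vacc_fused and
vacc_multipass are hypothetical, and the formula is copied from the vacc4()
snippet quoted above. Both vector versions allocate p only once; they differ
only in how many times they walk over it. This is an illustration of the
memory-traffic point, not a benchmark and not how Rcpp sugar is implemented.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Whole pipeline on a single double: intermediates can stay in registers.
inline double vacc_scalar(double age, bool female, double ily) {
    double p = 0.25 + 0.3 * 1 / (1 - std::exp(0.04 * age)) + 0.1 * ily;
    p *= female ? 1.25 : 0.75;
    p = std::max(0.0, p);
    return std::min(1.0, p);
}

// vacc3-style: one allocation, one pass; each p[i] is written once.
std::vector<double> vacc_fused(const std::vector<double>& age,
                               const std::vector<int>& female,
                               const std::vector<double>& ily) {
    assert(age.size() == female.size() && age.size() == ily.size());
    std::vector<double> p(age.size());
    for (std::size_t i = 0; i < age.size(); ++i)
        p[i] = vacc_scalar(age[i], female[i] != 0, ily[i]);
    return p;
}

// vacc4-style: still one allocation, but four passes; passes 2 to 4
// each read p[i] back from memory and write it again.
std::vector<double> vacc_multipass(const std::vector<double>& age,
                                   const std::vector<int>& female,
                                   const std::vector<double>& ily) {
    assert(age.size() == female.size() && age.size() == ily.size());
    const std::size_t n = age.size();
    std::vector<double> p(n);
    for (std::size_t i = 0; i < n; ++i)
        p[i] = 0.25 + 0.3 * 1 / (1 - std::exp(0.04 * age[i])) + 0.1 * ily[i];
    for (std::size_t i = 0; i < n; ++i)
        p[i] *= female[i] != 0 ? 1.25 : 0.75;
    for (std::size_t i = 0; i < n; ++i)
        p[i] = std::max(0.0, p[i]);
    for (std::size_t i = 0; i < n; ++i)
        p[i] = std::min(1.0, p[i]);
    return p;
}
```

Both functions should return the same values; any speed difference would
come from the extra reads and writes of p in the multi-pass version.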

Of course the difference is going to be minor, especially when the amount
of calculation is large relative to the number of memory accesses.

Your mapply(...,FUNCTOR) version, on the other hand, should be able to
be as quick as vacc3(), shouldn't it?
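
As a rough sketch of why a functor-based version could match the hand-written
loop: in the plain standard C++ below, map3 is a hypothetical stand-in for an
mapply-style helper (it is not Rcpp's mapply), and VaccFunctor is a
hypothetical functor holding the per-element formula from the quoted vacc4()
code. Because the functor type is a template parameter, the compiler can see
the body of operator() and will typically inline it, leaving a single pass
over the data.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical functor: the whole pipeline for one element.
struct VaccFunctor {
    double operator()(double age, bool female, double ily) const {
        double p = 0.25 + 0.3 * 1 / (1 - std::exp(0.04 * age)) + 0.1 * ily;
        p *= female ? 1.25 : 0.75;
        return std::min(1.0, std::max(0.0, p));
    }
};

// Hypothetical mapply-like helper: applies f element-wise over three
// equally sized inputs, in one pass, with one allocation for the result.
template <typename F>
std::vector<double> map3(const std::vector<double>& a,
                         const std::vector<int>& b,
                         const std::vector<double>& c,
                         F f) {
    assert(a.size() == b.size() && a.size() == c.size());
    std::vector<double> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        out[i] = f(a[i], b[i] != 0, c[i]);
    return out;
}
```

A call would look like map3(age, female, ily, VaccFunctor()).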

Darren

-- 
Darren Cook, Software Researcher/Developer

http://dcook.org/work/ (About me and my work)
http://dcook.org/blogs.html (My blogs and articles)