[Rcpp-devel] speed

Wed Aug 4 22:49:51 CEST 2010

Doug and Dirk, thanks, those are extremely helpful suggestions.

As for cmin, that was a mistake on my part. It should have been: cmin
= seq_len(K[i]+1). I think I know why it worked both ways, but I need
to investigate.

Regarding the different results you got Dirk, I noticed that you
declared lambda as int instead of double. So now it works and is
faster!

Doug, I just received your code and will check it out soon. I'll let
the list know if I can further improve this.

Thanks again,
Richard

inc2 <- '
#include <Rmath.h>

NumericVector wrap_dbinom(IntegerVector c, int k, double f) {
 NumericVector x(c.size());
 for (int i=0; i<c.size(); i++) x[i] = dbinom(c[i], k, f, false);
 return x;
}

NumericVector wrap_dpois(IntegerVector c, double l) {
 NumericVector x(c.size());
 for (int i=0; i<c.size(); i++) x[i] = dpois(c[i], l, false);
 return x;
}
'

fx2 <- cxxfunction(signature(Kr="integer"), '
   IntegerVector K(Kr);
   NumericMatrix bpsum(K.size(), K.size());
   for(int i=0; i<K.size(); i++) {
       for(int j=0; j<K.size(); j++) {
           IntegerVector cmin = seq_len(K[i]+1);
           IntegerVector cmin0 = cmin-1;
           NumericVector bin = wrap_dbinom(cmin0, K[j], 0.5);
           NumericVector pois = wrap_dpois(K[j]-cmin0, 1.5);
           NumericVector bp = bin * pois;
           bpsum(i, j) = std::accumulate(bp.begin(), bp.end(), 0.0);
           }
       }
   return bpsum;
   ', plugin="Rcpp", include=inc2)

all.equal(fx2(0:4), fx(0:4), fxR(0:4)) # TRUE

system.time(fx2(0:100)) # 0.92
system.time(fx(0:100)) # 2.2
system.time(fxR(0:100)) # 1.2

On Wed, Aug 4, 2010 at 4:41 PM, Douglas Bates <bates at stat.wisc.edu> wrote:
> Two other pieces of information.  I profiled the execution of fxR and
> it does seem to be spending its time in dbinom and dpois.
>
>> summaryRprof("Rprof.out")
> $by.self
>         self.time self.pct total.time total.pct
> "dbinom"      0.32     39.0       0.34      41.5
> "dpois"       0.28     34.1       0.30      36.6
> "fxR"         0.14     17.1       0.82     100.0
> ":"           0.04      4.9       0.04       4.9
> "sum"         0.04      4.9       0.04       4.9
> "str"         0.00      0.0       0.82     100.0
>
> $by.total
>         total.time total.pct self.time self.pct
> "fxR"          0.82     100.0      0.14     17.1
> "str"          0.82     100.0      0.00      0.0
> "dbinom"       0.34      41.5      0.32     39.0
> "dpois"        0.30      36.6      0.28     34.1
> ":"            0.04       4.9      0.04      4.9
> "sum"          0.04       4.9      0.04      4.9
>
> $sampling.time
> [1] 0.82
>
> Like Dirk I was able to get about a 65-70% improvement in speed by
> calling the R API functions Rf_dbinom and Rf_dpois and unrolling the
> loops a bit (see enclosed).  (Also my code seems to pass the test
> although that makes me wonder why the fx version also passes the
> test.)
>
> I would interpret the information from the profiling above to indicate
> that there are not big gains to be realized after this because so much
> of the elapsed time will be tied up in the dbinom and dpois
> evaluations.
>
> On Wed, Aug 4, 2010 at 3:10 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
>>
>> Richard,
>>
>> Great question.
>>
>> One obvious consideration for improvement is to not call dbinom and dpois
>> from R, but rather using the C entry points provided by Rmath.h. With that, I
>> get these performances:
>>
>>> require(rbenchmark)
>> Loading required package: rbenchmark
>>> benchmark(fx(0:100),
>> +           fx2(0:100),
>> +           fxR(0:100))
>>        test replications elapsed relative user.self
>> 1  fx(0:100)          100  89.426 3.811850     89.42
>> 2 fx2(0:100)          100  23.460 1.000000     23.46
>> 3 fxR(0:100)          100  41.699 1.777451     41.69
>>> # deleted empty sys.self, user,child, sys.child columns here
>>
>> The bad news is that I don't get the same results:
>>
>>> all.equal(fx(0:10), fxR(0:10))
>> [1] TRUE
>>> all.equal(fx2(0:10), fxR(0:10))
>> [1] "Mean relative difference: 0.2535309"
>>>
>>
>> But as Doug suggested, there may be a need for reworking things and checks
>> anyway. Or maybe I just introduced a bug -- dunno. I merely meant to help on
>> pointing out that Rmath.h is there too, and I did this as a quick and dirty
>> wrap around the 'atomistic' Rmath functions. Maybe someone would want to
>> contribute vectorised versions of these ?  Patches welcome, as they say...
>>
>> Dirk
>>
>>
>> inc2 <- '
>> #include <Rmath.h>
>>
>> NumericVector wrap_dbinom(IntegerVector c, int k, double f) {
>>  NumericVector x(c.size());
>>  for (int i=0; i<c.size(); i++) x[i] = dbinom(c[i], k, f, false);
>>  return x;
>> }
>>
>> NumericVector wrap_dpois(IntegerVector c, int l) {
>>  NumericVector x(c.size());
>>  for (int i=0; i<c.size(); i++) x[i] = dpois(c[i], l, false);
>>  return x;
>> }
>> '
>>
>> fx2 <- cxxfunction(signature(Kr="integer"), '
>>    IntegerVector K(Kr);
>>    NumericMatrix bpsum(K.size(), K.size());
>>    for(int i=0; i<K.size(); i++) {
>>        for(int j=0; j<K.size(); j++) {
>>            IntegerVector Ki = K[i];
>>            IntegerVector cmin = seq_len(Ki.size()+1);
>>            IntegerVector cmin0 = cmin-1;
>>            NumericVector bin = wrap_dbinom(cmin0, K[j], 0.5);
>>            NumericVector pois = wrap_dpois(K[j]-cmin0, 1.5);
>>            NumericVector bp = bin * pois;
>>            bpsum(i, j) = std::accumulate(bp.begin(), bp.end(), 0.0);
>>            }
>>        }
>>    return bpsum;
>>    ', plugin="Rcpp", include=inc2)
>>
>>
>>
>>
>> --
>> Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
>> _______________________________________________
>> Rcpp-devel mailing list
>> Rcpp-devel at lists.r-forge.r-project.org
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>>
>