[Rcpp-devel] speed
Richard Chandler
richard.chandlers at gmail.com
Wed Aug 4 22:49:51 CEST 2010
Doug and Dirk, thanks, those are extremely helpful suggestions.
As for cmin, that was a mistake on my part. It should have been: cmin
= seq_len(K[i]+1). I think I know why it worked both ways, but I need
to investigate.
Regarding the different results you got Dirk, I noticed that you
declared lambda as int instead of double. So now it works and is
faster!
Doug, I just received your code and will check it out soon. I'll let
the list know if I can further improve this.
Thanks again,
Richard
inc2 <- '
#include <Rmath.h>
NumericVector wrap_dbinom(IntegerVector c, int k, double f) {
NumericVector x(c.size());
for (int i=0; i<c.size(); i++) x[i] = dbinom(c[i], k, f, false);
return x;
}
NumericVector wrap_dpois(IntegerVector c, double l) {
NumericVector x(c.size());
for (int i=0; i<c.size(); i++) x[i] = dpois(c[i], l, false);
return x;
}
'
fx2 <- cxxfunction(signature(Kr="integer"), '
IntegerVector K(Kr);
NumericMatrix bpsum(K.size(), K.size());
for(int i=0; i<K.size(); i++) {
for(int j=0; j<K.size(); j++) {
IntegerVector cmin = seq_len(K[i]+1);
IntegerVector cmin0 = cmin-1;
NumericVector bin = wrap_dbinom(cmin0, K[j], 0.5);
NumericVector pois = wrap_dpois(K[j]-cmin0, 1.5);
NumericVector bp = bin * pois;
bpsum(i, j) = std::accumulate(bp.begin(), bp.end(), 0.0);
}
}
return bpsum;
', plugin="Rcpp", include=inc2)
all.equal(fx2(0:4), fx(0:4), fxR(0:4)) # TRUE
system.time(fx2(0:100)) # 0.92
system.time(fx(0:100)) # 2.2
system.time(fxR(0:100)) # 1.2
On Wed, Aug 4, 2010 at 4:41 PM, Douglas Bates <bates at stat.wisc.edu> wrote:
> Two other pieces of information. I profiled the execution of fxR and
> it does seem to be spending its time in dbinom and dpois.
>
>> summaryRprof("Rprof.out")
> $by.self
> self.time self.pct total.time total.pct
> "dbinom" 0.32 39.0 0.34 41.5
> "dpois" 0.28 34.1 0.30 36.6
> "fxR" 0.14 17.1 0.82 100.0
> ":" 0.04 4.9 0.04 4.9
> "sum" 0.04 4.9 0.04 4.9
> "str" 0.00 0.0 0.82 100.0
>
> $by.total
> total.time total.pct self.time self.pct
> "fxR" 0.82 100.0 0.14 17.1
> "str" 0.82 100.0 0.00 0.0
> "dbinom" 0.34 41.5 0.32 39.0
> "dpois" 0.30 36.6 0.28 34.1
> ":" 0.04 4.9 0.04 4.9
> "sum" 0.04 4.9 0.04 4.9
>
> $sampling.time
> [1] 0.82
>
> Like Dirk I was able to get about a 65-70% improvement in speed by
> calling the R API functions Rf_dbinom and Rf_dpois and unrolling the
> loops a bit (see enclosed). (Also my code seems to pass the test
> although that makes me wonder why the fx version also passes the
> test.)
>
> I would interpret the information from the profiling above to indicate
> that there are not big gains to be realized after this because so much
> of the elapsed time will be tied up in the dbinom and dpois
> evaluations.
>
> On Wed, Aug 4, 2010 at 3:10 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
>>
>> Richard,
>>
>> Great question.
>>
>> One obvious consideration for improvement is to not call dbinom and dpois
>> from R, but rather using the C entry points provided by Rmath.h. With that, I
>> get these performances:
>>
>>> require(rbenchmark)
>> Loading required package: rbenchmark
>>> benchmark(fx(0:100),
>> + fx2(0:100),
>> + fxR(0:100))
>> test replications elapsed relative user.self
>> 1 fx(0:100) 100 89.426 3.811850 89.42
>> 2 fx2(0:100) 100 23.460 1.000000 23.46
>> 3 fxR(0:100) 100 41.699 1.777451 41.69
>>> # deleted empty sys.self, user,child, sys.child columns here
>>
>> The bad news is that I don't get the same results:
>>
>>> all.equal(fx(0:10), fxR(0:10))
>> [1] TRUE
>>> all.equal(fx2(0:10), fxR(0:10))
>> [1] "Mean relative difference: 0.2535309"
>>>
>>
>> But as Doug suggested, there may be a need for reworking things and checks
>> anyway. Or maybe I just introduced a bug -- dunno. I merely meant to help on
>> pointing out that Rmath.h is there too, and I did this as a quick and dirty
>> wrap around the 'atomistic' Rmath functions. Maybe someone would want to
>> contribute vectorised versions of these ? Patches welcome, as they say...
>>
>> Dirk
>>
>>
>> inc2 <- '
>> #include <Rmath.h>
>>
>> NumericVector wrap_dbinom(IntegerVector c, int k, double f) {
>> NumericVector x(c.size());
>> for (int i=0; i<c.size(); i++) x[i] = dbinom(c[i], k, f, false);
>> return x;
>> }
>>
>> NumericVector wrap_dpois(IntegerVector c, int l) {
>> NumericVector x(c.size());
>> for (int i=0; i<c.size(); i++) x[i] = dpois(c[i], l, false);
>> return x;
>> }
>> '
>>
>> fx2 <- cxxfunction(signature(Kr="integer"), '
>> IntegerVector K(Kr);
>> NumericMatrix bpsum(K.size(), K.size());
>> for(int i=0; i<K.size(); i++) {
>> for(int j=0; j<K.size(); j++) {
>> IntegerVector Ki = K[i];
>> IntegerVector cmin = seq_len(Ki.size()+1);
>> IntegerVector cmin0 = cmin-1;
>> NumericVector bin = wrap_dbinom(cmin0, K[j], 0.5);
>> NumericVector pois = wrap_dpois(K[j]-cmin0, 1.5);
>> NumericVector bp = bin * pois;
>> bpsum(i, j) = std::accumulate(bp.begin(), bp.end(), 0.0);
>> }
>> }
>> return bpsum;
>> ', plugin="Rcpp", include=inc2)
>>
>>
>>
>>
>> --
>> Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
>> _______________________________________________
>> Rcpp-devel mailing list
>> Rcpp-devel at lists.r-forge.r-project.org
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>>
>
More information about the Rcpp-devel
mailing list