[Rcpp-devel] Sugar seems slower than Rcpp.
Dirk Eddelbuettel
edd at debian.org
Wed Jan 5 14:01:26 CET 2011
On 5 January 2011 at 10:55, Cedric Ginestet wrote:
| Dear All,
|
| Here are some simulations that I have run this morning. Romain's suggestion to
| compute xV.size() before the loop and Douglas' idea of using accumulate appear
| to work best. However, both are substantially slower than the r-base function.
|
| I have also included two more versions: (i) one similar to Romain's but using
| pre-incrementation in the loop and (ii) one using the iterator in the loop.
| Another option may be to use the C++ boost library. I don't know if anyone on
| this list has experience with using boost.
|
| See the results of the simulations below (N=1000 data sets).
| Ced
|
| #####################################################################
| ## Functions.
| Summing1 <- cxxfunction(signature(x="numeric"), '
| NumericVector xV(x);
| double out = sum(xV);
| return wrap(out);
| ',plugin="Rcpp")
| Summing2 <- cxxfunction(signature(x="numeric"), '
| NumericVector xV(x);
| double out = 0.0;
| for(int i=0; i<xV.size(); i++) out += xV[i];
| return wrap(out);
| ',plugin="Rcpp")
| Summing3 <- cxxfunction(signature(x="numeric"), '
| NumericVector xV(x);
| double out = 0.0; int N=xV.size();
| for(int i=0; i<N; i++) out += xV[i];
| return wrap(out);
| ',plugin="Rcpp")
| Summing4 <- cxxfunction(signature(x="numeric"), '
| NumericVector xV(x);
| return wrap(std::accumulate(xV.begin(), xV.end(), double()));
| ',plugin="Rcpp")
| Summing5 <- cxxfunction(signature(x="numeric"), '
| NumericVector xV(x);
| double out = 0.0; int N=xV.size();
| for(int i=0; i<N; ++i) out += xV[i];
| return wrap(out);
| ',plugin="Rcpp")
| Summing6 <- cxxfunction(signature(x="numeric"), '
| NumericVector xV(x);
| double out = 0.0;
| for(NumericVector::iterator i=xV.begin(); i!=xV.end(); ++i) out += *i;
| return wrap(out);
| ',plugin="Rcpp")
|
| #####################################################################
| ## Simulation: Time Testing.
| n <- 1000000; N <- 1000
| time.Sum <- matrix(0,N,7);
| for(i in 1:N){
| x <- rnorm(n)
| time.Sum[i,1] <- system.time(Summing1(x))[3];
| time.Sum[i,2] <- system.time(Summing2(x))[3];
| time.Sum[i,3] <- system.time(Summing3(x))[3];
| time.Sum[i,4] <- system.time(Summing4(x))[3];
| time.Sum[i,5] <- system.time(Summing5(x))[3];
| time.Sum[i,6] <- system.time(Summing6(x))[3];
| time.Sum[i,7] <- system.time(sum(x))[3];
| }# i
| time.df <- data.frame(time.Sum)
| names(time.df) <- c("Sugar","Rcpp","Rcpp_N","Accumulate","Pre-increment","Iterator","R")
| boxplot(time.df)
|
| #####################################################################
| ## RESULTS:
| formatC(summary(time.df),dec=3)
| Sugar               Rcpp                Rcpp_N
| " Min. :0.01600 "   " Min. :0.01000 "   "Min. :0.005000 "
| " 1st Qu.:0.01600 " " 1st Qu.:0.01000 " "1st Qu.:0.005000 "
| " Median :0.01600 " " Median :0.01100 " "Median :0.006000 "
| " Mean :0.01631 "   " Mean :0.01060 "   "Mean :0.005668 "
| " 3rd Qu.:0.01600 " " 3rd Qu.:0.01100 " "3rd Qu.:0.006000 "
| " Max. :0.03700 "   " Max. :0.02400 "   "Max. :0.020000 "
| Accumulate          Pre-increment       Iterator
| "Min. :0.005000 "   "Min. :0.005000 "   " Min. :0.01000 "
| "1st Qu.:0.005000 " "1st Qu.:0.005000 " " 1st Qu.:0.01000 "
| "Median :0.006000 " "Median :0.006000 " " Median :0.01100 "
| "Mean :0.005714 "   "Mean :0.005697 "   " Mean :0.01065 "
| "3rd Qu.:0.006000 " "3rd Qu.:0.006000 " " 3rd Qu.:0.01100 "
| "Max. :0.029000 "   "Max. :0.021000 "   " Max. :0.03100 "
| R
| "Min. :0.002000 "
| "1st Qu.:0.002000 "
| "Median :0.002000 "
| "Mean :0.002211 "
| "3rd Qu.:0.002000 "
| "Max. :0.004000 "
| #####################################################################
|
| PS: Apologies to Dirk as I have not followed his advice, yet.
Try this instead:
## Summing1 to Summing6 as above
Summing1a <- cxxfunction(signature(x="numeric"), '
NumericVector xV(x);
double out = sum(noNA(xV));
return wrap(out);
',plugin="Rcpp")
library(rbenchmark)
n <- 1000000
N <- 1000
x <- rnorm(n)
bm <- benchmark(Sugar = Summing1(x),
SugarNoNA = Summing1a(x),
Rcpp = Summing2(x),
Rcpp_N = Summing3(x),
Accumulate= Summing4(x),
PreIncrem = Summing5(x),
Iterator = Summing6(x),
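## NB: as written, this expression only constructs a closure and never
## calls sum(x), so the "R" row does not time the builtin; R = sum(x) would.
R = function(x){ sum(x) },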
columns=c("test", "elapsed", "relative", "user.self", "sys.self"),
order="relative",
replications=N)
print(bm)
which on my box gives this:
edd@max:/tmp$ Rscript cedric.R
Loading required package: methods
        test elapsed relative user.self sys.self
8          R   0.003     1.00      0.00        0
5 Accumulate   1.212   404.00      1.22        0
2  SugarNoNA   1.214   404.67      1.22        0
6  PreIncrem   1.214   404.67      1.21        0
4     Rcpp_N   1.215   405.00      1.21        0
7   Iterator   5.301  1767.00      5.30        0
3       Rcpp   5.302  1767.33      5.30        0
1      Sugar   7.229  2409.67      7.21        0
edd@max:/tmp$
indicating that you have four equivalent versions, none of which can go
as fast as an R builtin goes (well, doh).
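
What the fast group has in common is that the loop bound is fixed before the
loop starts; the two slow hand-written variants (Rcpp and Iterator) call
xV.size() or xV.end() in every evaluation of the loop condition. A minimal
standalone sketch of the three fast shapes follows. It assumes a plain
std::vector<double> as a stand-in for NumericVector, and the function names
are made up for illustration, so it shows the loop structure only and says
nothing about Rcpp internals:

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Shape of Rcpp_N / PreIncrem: read the length once, before the loop,
// instead of calling size() in every test of the loop condition.
double sum_hoisted(const std::vector<double>& v) {
    double out = 0.0;
    const std::size_t n = v.size();
    for (std::size_t i = 0; i < n; ++i) out += v[i];
    return out;
}

// Shape of Accumulate: begin() and end() are each evaluated exactly once,
// as arguments to std::accumulate.
double sum_accumulate(const std::vector<double>& v) {
    return std::accumulate(v.begin(), v.end(), 0.0);
}

// The Iterator variant with the end iterator hoisted out of the condition
// (hypothetical rewrite; the Summing6 above calls end() on every pass).
double sum_iterator_hoisted(const std::vector<double>& v) {
    double out = 0.0;
    for (auto it = v.begin(), last = v.end(); it != last; ++it) out += *it;
    return out;
}
```

All three add the elements in the same left-to-right order, so they return
the same value for the same input.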
Basic sugar, as we said before, gives a lot of convenience along with some
safeties (exception checks, NA checks, ...).
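
To make the NA-check cost concrete, here is a rough standalone sketch of the
difference between a checked and an unchecked sum, which is the gap between
the Sugar and SugarNoNA rows above. It is an illustration under assumptions,
not Rcpp's actual code: std::vector<double> stands in for NumericVector,
std::isnan stands in for R's NA test, and both function names are
hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Checked sum: every element is tested before it is added, and the
// function bails out with NaN as soon as a missing value is seen.
// The extra branch per element is the price of the safety.
double sum_checked(const std::vector<double>& v) {
    double out = 0.0;
    const std::size_t n = v.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (std::isnan(v[i])) return std::numeric_limits<double>::quiet_NaN();
        out += v[i];
    }
    return out;
}

// Unchecked sum: the caller promises there are no missing values
// (the role noNA() plays in Summing1a), so the loop body is one addition.
double sum_unchecked(const std::vector<double>& v) {
    double out = 0.0;
    const std::size_t n = v.size();
    for (std::size_t i = 0; i < n; ++i) out += v[i];
    return out;
}
```

On clean input the two return the same value; only the per-element work
differs.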
But you are not the first person, and surely not the last, to simply assume
that it would also be as fast as carefully tuned and crafted code.
But that ain't so -- the No Free Lunch theorem is still valid.
Dirk
--
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com