[Rcpp-devel] Sugar seems slower than Rcpp.

Dirk Eddelbuettel edd at debian.org
Wed Jan 5 14:01:26 CET 2011


On 5 January 2011 at 10:55, Cedric Ginestet wrote:
| Dear All,
| 
| Here are some simulations that I have run this morning. Romain's suggestion to
| compute xV.size() before the loop and Douglas' idea of using accumulate appear
| to work best. However, both are substantially slower than the r-base function.
| 
| I have also included two more versions: (i) one similar to Romain's but using
| pre-incrementation in the loop and (ii) one using the iterator in the loop.
| Another option may be to use the C++ boost library. I don't know if anyone on
| this list has experience with using boost.
| 
| See the results of the simulations below (N=1000 data sets).
| Ced
| 
| #####################################################################
| ## Functions.
| Summing1 <- cxxfunction(signature(x="numeric"), '
|       NumericVector xV(x);
|       double out = sum(xV);
|       return wrap(out);
| ',plugin="Rcpp")
| Summing2 <- cxxfunction(signature(x="numeric"), '
|       NumericVector xV(x);
|       double out = 0.0;
|       for(int i=0; i<xV.size(); i++) out += xV[i]; 
|       return wrap(out);
| ',plugin="Rcpp")
| Summing3 <- cxxfunction(signature(x="numeric"), '
|       NumericVector xV(x);
|       double out = 0.0; int N=xV.size();
|       for(int i=0; i<N; i++) out += xV[i]; 
|       return wrap(out);
| ',plugin="Rcpp")
| Summing4 <- cxxfunction(signature(x="numeric"), '
|       NumericVector xV(x);
|       return wrap(std::accumulate(xV.begin(), xV.end(), double()));
| ',plugin="Rcpp")
| Summing5 <- cxxfunction(signature(x="numeric"), '
|       NumericVector xV(x);
|       double out = 0.0; int N=xV.size();
|       for(int i=0; i<N; ++i) out += xV[i]; 
|       return wrap(out);
| ',plugin="Rcpp")
| Summing6 <- cxxfunction(signature(x="numeric"), '
|       NumericVector xV(x);
|       double out = 0.0;
|       for(NumericVector::iterator i=xV.begin(); i!=xV.end(); ++i) out += *i; 
|       return wrap(out);
| ',plugin="Rcpp")
| 
| #####################################################################
| ## Simulation: Time Testing.
| n <- 1000000; N <- 1000
| time.Sum <- matrix(0,N,7);
| for(i in 1:N){
| x <- rnorm(n)
| time.Sum[i,1] <- system.time(Summing1(x))[3]; 
| time.Sum[i,2] <- system.time(Summing2(x))[3];
| time.Sum[i,3] <- system.time(Summing3(x))[3]; 
| time.Sum[i,4] <- system.time(Summing4(x))[3];
| time.Sum[i,5] <- system.time(Summing5(x))[3];
| time.Sum[i,6] <- system.time(Summing6(x))[3];
| time.Sum[i,7] <- system.time(sum(x))[3]; 
| }# i
| time.df <- data.frame(time.Sum)
| names(time.df) <- c
| ("Sugar","Rcpp","Rcpp_N","Accumulate","Pre-increment","Iterator","R")
| boxplot(time.df)
| 
| #####################################################################
| ## RESULTS:
| formatC(summary(time.df),dec=3)
|      Sugar                 Rcpp                Rcpp_N         
|  " Min.   :0.01600  " " Min.   :0.01000  " "Min.   :0.005000  "
|  " 1st Qu.:0.01600  " " 1st Qu.:0.01000  " "1st Qu.:0.005000  "
|  " Median :0.01600  " " Median :0.01100  " "Median :0.006000  "
|  " Mean   :0.01631  " " Mean   :0.01060  " "Mean   :0.005668  "
|  " 3rd Qu.:0.01600  " " 3rd Qu.:0.01100  " "3rd Qu.:0.006000  "
|  " Max.   :0.03700  " " Max.   :0.02400  " "Max.   :0.020000  "
|    Accumulate         Pre-increment           Iterator        
|  "Min.   :0.005000  " "Min.   :0.005000  " " Min.   :0.01000  "
|  "1st Qu.:0.005000  " "1st Qu.:0.005000  " " 1st Qu.:0.01000  "
|  "Median :0.006000  " "Median :0.006000  " " Median :0.01100  "
|  "Mean   :0.005714  " "Mean   :0.005697  " " Mean   :0.01065  "
|  "3rd Qu.:0.006000  " "3rd Qu.:0.006000  " " 3rd Qu.:0.01100  "
|  "Max.   :0.029000  " "Max.   :0.021000  " " Max.   :0.03100  "
|        R            
|  "Min.   :0.002000  "
|  "1st Qu.:0.002000  "
|  "Median :0.002000  "
|  "Mean   :0.002211  "
|  "3rd Qu.:0.002000  "
|  "Max.   :0.004000  "
| #####################################################################
| 
| PS: Apologies to Dirk as I have not followed his advice, yet.

Try this instead:

    ## Summing1 to Summing6 as above
  
    Summing1a <- cxxfunction(signature(x="numeric"), '
          NumericVector xV(x);
          double out = sum(noNA(xV));
          return wrap(out);
    ',plugin="Rcpp")
   
    library(rbenchmark)
    n <- 1000000
    N <- 1000
    x <- rnorm(n)
    
    bm <- benchmark(Sugar     = Summing1(x),
                    SugarNoNA = Summing1a(x),
                    Rcpp      = Summing2(x),
                    Rcpp_N    = Summing3(x),
                    Accumulate= Summing4(x),
                    PreIncrem = Summing5(x),
                    Iterator  = Summing6(x),
                    R         = function(x){ sum(x) },
                    columns=c("test", "elapsed", "relative", "user.self", "sys.self"),
                    order="relative",
                    replications=N)
    print(bm)
  
which on my box gets this

    edd at max:/tmp$ Rscript cedric.R 
    Loading required package: methods
            test elapsed relative user.self sys.self
    8          R   0.003     1.00      0.00        0
    5 Accumulate   1.212   404.00      1.22        0
    2  SugarNoNA   1.214   404.67      1.22        0
    6  PreIncrem   1.214   404.67      1.21        0
    4     Rcpp_N   1.215   405.00      1.21        0
    7   Iterator   5.301  1767.00      5.30        0
    3       Rcpp   5.302  1767.33      5.30        0
    1      Sugar   7.229  2409.67      7.21        0
    edd at max:/tmp$ 

indicating that you have four equivalent versions neither on of which can go
as fast as an R builtin goes (well, doh).

Basic sugar, as we said before, gives a lot of convenience along with some
safeties (exception checks, NA checks, ...).  

But you are not the first person, and surely not the last, to simply assume
that it would also be as fast as carefully tuned and crafted code.  

But that ain't so -- the No Free Lunch theorem is still valid.

Dirk

-- 
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com


More information about the Rcpp-devel mailing list