[Rcpp-devel] Using pointers with Numeric Vectors

Dirk Eddelbuettel edd at debian.org
Thu Dec 20 20:21:00 CET 2012


On 20 December 2012 at 12:35, Alon Honig wrote:
| My current Rcpp program is only 4 times faster than its R byte-code-compiled
| equivalent. I am trying to speed up my code, and from what I understand, using
| pointers prevents an external function from copying the entire object when
| executing its methods. I am having difficulty finding salient examples that
| use pointers and NumericVectors. Please tell me how, in this same code, I could
| get the functions "get_sum" and "get_var" to point to the NumericVector object
| "v" rather than copy the whole thing.

Sometimes it helps to think through the problem. Do you really need two
external function calls in the internal loop?  And if you do, and know
sufficient C++, why not declare them inline?

Next, current Rcpp versions also have var() as a sugar operation. So it's
worth looking at that.

Finally, consider the rbenchmark or microbenchmark packages for the timings.

Hence I get the following -- by rewriting your program as a single sugar call:

R> library(Rcpp)
R> a <- 1:1000000
R> cppFunction('double rcppVar(NumericVector x) { return var(x); }')
R> library(rbenchmark)
R> res <- benchmark(var(a), rcppVar(a))
R> res[,1:4]
        test replications elapsed relative
2 rcppVar(a)          100   4.641    1.000
1     var(a)          100   8.711    1.877
R> res <- benchmark(var(a), rcppVar(a))
R> res[,1:4]
        test replications elapsed relative
2 rcppVar(a)          100   3.949    1.000
1     var(a)          100   8.102    2.052
R> 

So that is around 1.9 to 2.1 times as fast as the vectorised R code, which
may have to do some more marshalling and sanity checks.

Hope this helps, Dirk


| 
| 
|     library(inline)
|     library(Rcpp)
|     a = 1:1000000
|     Rcpp.var = cxxfunction(signature(input="numeric"), plugin="Rcpp",
|     body="
|         NumericVector v = input;
|         int n = v.size();
|         double v_mean = get_sum(v, n) / n;
|         double v_var = get_var(v, v_mean, n);
|         return wrap(v_var);
|     ", includes="
|         double get_var(NumericVector v, double m, int l)
|         {
|             double a = 0;
|             for (int i = 0; i < l; i++)
|             { a += (v[i] - m) * (v[i] - m); }
|             return (a / l);
|         }
|
|         double get_sum(NumericVector v, int l)
|         {
|             double s = 0;
|             for (int i = 0; i < l; i++)
|             { s += v[i]; }
|             return (s);
|         }
|     ")
|
|     b = system.time(for (i in 1:100) Rcpp.var(a))
|     c = system.time(for (i in 1:100) var(a))
| 
| 
| 
| Thank you, Alon.
| 
| P.S. I am aware that the "get_var" function provides the population variance
| and not the sample variance.
| 
| 
-- 
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com  

