[Rcpp-devel] When does using iterators for subscripting help?

Douglas Bates bates at stat.wisc.edu
Fri Jan 6 18:39:37 CET 2012


2012/1/4 Hadley Wickham <hadley at rice.edu>:
>> NumericVector::iterator is actually an alias for double*. Here is a
>> trick (probably does not work on non-gcc compilers):
>
> Ah, interesting - thanks!

I'm coming late to the party but ...  I was able to squeeze several
milliseconds out of the computation by storing the counts as integers
and by wrapping x in an Eigen::Map so that it is not copied.  It may
well be that an Rcpp::NumericVector would not be a copy in this case
either, but I have found it difficult to determine exactly when an
Rcpp vector is going to be copied.  With Eigen::Map I can ensure that
the original storage in R is used for the vector.

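library(inline)  # provides cxxfunction()

count_eigen <- cxxfunction(signature(x="numeric", binwidth="numeric",
                                     origin="numeric", nbins="integer"), '
// Convert the scalar arguments through the R API.
double binwidth_ = ::Rf_asReal(binwidth), origin_ = ::Rf_asReal(origin);

Eigen::VectorXi counts(::Rf_asInteger(nbins));
// Map the storage of x directly; no copy is made.
Eigen::Map<Eigen::VectorXd> x_(as<Eigen::Map<Eigen::VectorXd> >(x));

int n = x_.size();

counts.setZero();
// Truncate the scaled offset to an integer bin index (no bounds check).
for (int i = 0; i < n; i++)
    counts[static_cast<int>((x_[i] - origin_) / binwidth_)]++;

return wrap(counts);
', plugin="RcppEigen")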
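
For anyone who wants to experiment with the loop outside of R, here is
a minimal standalone sketch of the same binning idea in plain C++.  The
function name count_bins and its signature are hypothetical, not part
of Rcpp or Eigen.  It reads the data through a non-owning pointer, in
the same spirit as the Eigen::Map above, and it adds a range check that
the loop in count_eigen does not have.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not from Rcpp or Eigen: count how many of the n
// values in x fall into each of nbins bins of width binwidth, with the
// first bin starting at origin.  x is a non-owning pointer, so the input
// data are read in place and never copied.
std::vector<int> count_bins(const double* x, std::size_t n,
                            double binwidth, double origin, int nbins) {
    assert(binwidth > 0.0 && nbins >= 0);
    std::vector<int> counts(nbins, 0);
    for (std::size_t i = 0; i < n; ++i) {
        double pos = (x[i] - origin) / binwidth;
        // Assumption: values outside [origin, origin + nbins * binwidth),
        // and NaN, are skipped instead of being written out of bounds.
        if (!(pos >= 0.0 && pos < nbins)) continue;
        ++counts[static_cast<int>(pos)];  // truncation selects the bin
    }
    return counts;
}
```

Whether silently skipping out-of-range values is the right policy
depends on the caller; an alternative would be to report them as an
error.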


As you see, I use ::Rf_asReal() and ::Rf_asInteger() for conversion to
scalar doubles or scalar integers.  Those functions are part of the R
API and are fast and general.

With this version I get a minor speedup, roughly 5% over the iterator
version and 7% over the operator version at the median:

> microbenchmark(operator = count_bin(x, binwidth, origin, nbins = n),
+                iterator = count_bini(x, binwidth, origin, nbins = n),
+                eigen    = count_eigen(x, binwidth, origin, nbins = n)
+ )
Unit: milliseconds
      expr      min       lq   median       uq      max
1    eigen 145.6456 145.7102 145.7536 145.8210 151.0456
2 iterator 153.3059 153.3603 153.4182 153.6056 155.3246
3 operator 156.2418 156.7063 156.7637 156.8982 159.8635

