[Rcpp-devel] When does using iterators for subscripting help?

Sat Jan 7 15:51:47 CET 2012

I thought I could make a difference by trying some loop unrolling 
voodoo, with this macro:

/* Requires an int n in the enclosing scope; declares the index i used by EXPR. */
#define LOOP_UNROLL(EXPR)                         \
int trip_count_ = n >> 2 ;                        \
int i = 0 ;                                       \
for ( ; trip_count_ > 0 ; --trip_count_) {        \
     EXPR ; i++ ;                                 \
     EXPR ; i++ ;                                 \
     EXPR ; i++ ;                                 \
     EXPR ; i++ ;                                 \
}                                                 \
switch (n - i){                                   \
   case 3:                                        \
       EXPR ; i++ ; /* fall through */            \
   case 2:                                        \
       EXPR ; i++ ; /* fall through */            \
   case 1:                                        \
       EXPR ; i++ ;                               \
   case 0:                                        \
   default:                                       \
       {}                                         \
}

and these versions:

count_bin_unroll <- cxxfunction(signature(x = "numeric", binwidth =
"numeric", origin = "numeric", nbins = "integer"), '
   int nbins_ =  as<int>(nbins);
   double binwidth_ = as<double>(binwidth);
   double origin_ = as<double>(origin);

   Rcpp::NumericVector counts(nbins_);
   Rcpp::NumericVector x_(x);

   int n = x_.size();

   LOOP_UNROLL(( counts[(int) ((x_[i] - origin_) / binwidth_)]++ ))

   return counts;
', plugin = "Rcpp", includes = readLines( "loopunroll.h" ) )

count_bini_unroll <- cxxfunction(signature(x = "numeric", binwidth =
"numeric", origin = "numeric", nbins = "integer"), '
   int nbins_ =  as<int>(nbins);
   double binwidth_ = as<double>(binwidth);
   double origin_ = as<double>(origin);

   Rcpp::NumericVector counts(nbins_);
   Rcpp::NumericVector x_(x);

   int n = x_.size();

   Rcpp::NumericVector::iterator x_i = x_.begin();
   Rcpp::NumericVector::iterator counts_i = counts.begin();

   LOOP_UNROLL(( counts_i[(int) ( (x_i[i] - origin_) / binwidth_)]++ ))

   return counts;
', plugin = "Rcpp", includes = readLines( "loopunroll.h" ) )

But it turns out it does not help: in the timings below the unrolled
versions are, if anything, slightly slower. I get these:

Unit: milliseconds
              expr      min       lq   median       uq      max
1        iterator 31.26655 31.52351 31.64775 31.81314  89.9450
2 iterator_unroll 32.12131 32.34650 32.48870 32.69931 109.7043
3        operator 33.35224 33.59498 33.73557 33.91297 126.1293
4 operator_unroll 33.62351 33.77934 33.93898 34.28303 118.9024

But I wanted to share it anyway.
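
For anyone reading without loopunroll.h at hand, here is a rough sketch of
what LOOP_UNROLL expands to, written as plain C++ over std::vector instead
of Rcpp::NumericVector. That substitution and the names bin_index,
count_bin_plain and count_bin_unrolled are assumptions made for
illustration only; they are not part of the code above. The sketch only
shows that the unrolled form computes the same counts as the simple loop.
It says nothing about timing.

```cpp
#include <vector>

// Hypothetical helper: maps a value to its bin. Like the original code,
// it does no bounds checking, so the caller must make sure every value
// lands in [0, nbins).
inline int bin_index(double value, double origin, double binwidth) {
    return static_cast<int>((value - origin) / binwidth);
}

// Straightforward loop: one bin update per iteration.
std::vector<double> count_bin_plain(const std::vector<double>& x,
                                    double binwidth, double origin,
                                    int nbins) {
    std::vector<double> counts(nbins, 0.0);
    const int n = static_cast<int>(x.size());
    for (int i = 0; i < n; i++) {
        counts[bin_index(x[i], origin, binwidth)]++;
    }
    return counts;
}

// Hand-expanded equivalent of the macro: four updates per trip, then a
// fall-through switch that handles the 0 to 3 leftover elements.
std::vector<double> count_bin_unrolled(const std::vector<double>& x,
                                       double binwidth, double origin,
                                       int nbins) {
    std::vector<double> counts(nbins, 0.0);
    const int n = static_cast<int>(x.size());
    int i = 0;
    for (int trips = n >> 2; trips > 0; --trips) {
        counts[bin_index(x[i], origin, binwidth)]++; i++;
        counts[bin_index(x[i], origin, binwidth)]++; i++;
        counts[bin_index(x[i], origin, binwidth)]++; i++;
        counts[bin_index(x[i], origin, binwidth)]++; i++;
    }
    switch (n - i) {
        case 3:
            counts[bin_index(x[i], origin, binwidth)]++; i++;
            // fall through
        case 2:
            counts[bin_index(x[i], origin, binwidth)]++; i++;
            // fall through
        case 1:
            counts[bin_index(x[i], origin, binwidth)]++; i++;
            // fall through
        default:
            break;
    }
    return counts;
}
```

An optimizing compiler may well unroll the simple loop on its own, which
would be one possible reason for the manual version not paying off, but
that is a guess and not something the timings above establish.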

Romain

On 04/01/12 15:16, Hadley Wickham wrote:
> Hi all,
>
> Slightly less dense question (hopefully).  In the code below I have
> two versions of the same function - one uses operator[] and the other
> uses iterators.  Following the Rcpp introduction, I had expected the
> iterator version to be substantially faster, but I'm only seeing a
> minor improvement (~10%).  Why doesn't using iterators help me much
> here?  Possible explanations:
>
> * I'm using iterators incorrectly in my code
>
> * Iterators help most when the vector access is sequential, and here
> the counts index is bouncing all over the place, so I shouldn't expect
> much improvement.
>
> Any ideas would be much appreciated.  Thanks!
>
> Hadley
>
>
> library(inline)
>
> count_bin <- cxxfunction(signature(x = "numeric", binwidth =
> "numeric", origin = "numeric", nbins = "integer"), '
>    int nbins_ =  as<int>(nbins);
>    double binwidth_ = as<double>(binwidth);
>    double origin_ = as<double>(origin);
>
>    Rcpp::NumericVector counts(nbins_);
>    Rcpp::NumericVector x_(x);
>
>    int n = x_.size();
>
>    for(int i = 0; i < n; i++) {
>      counts[(int) ((x_[i] - origin_) / binwidth_)]++;
>    }
>
>    return counts;
> ', plugin = "Rcpp")
>
> count_bini <- cxxfunction(signature(x = "numeric", binwidth =
> "numeric", origin = "numeric", nbins = "integer"), '
>    int nbins_ =  as<int>(nbins);
>    double binwidth_ = as<double>(binwidth);
>    double origin_ = as<double>(origin);
>
>    Rcpp::NumericVector counts(nbins_);
>    Rcpp::NumericVector x_(x);
>
>    int n = x_.size();
>
>    Rcpp::NumericVector::iterator x_i = x_.begin();
>    Rcpp::NumericVector::iterator counts_i = counts.begin();
>
>    for(int i = 0; i < n; i++) {
>      counts_i[(int) ((x_i[i] - origin_) / binwidth_)]++;
>    }
>
>    return counts;
> ', plugin = "Rcpp")
>
> x <- rnorm(1e7, sd = 3)
> origin <- min(x)
> binwidth <- 1
> n <- ceiling((max(x) - origin) / binwidth)
>
> system.time(y1 <- count_bin(x, binwidth, origin, nbins = n))
> system.time(y2 <- count_bini(x, binwidth, origin, nbins = n))
> all.equal(y1, y2)
>
> library(microbenchmark)
> microbenchmark(
>    operator = count_bin(x, binwidth, origin, nbins = n),
>    iterator = count_bini(x, binwidth, origin, nbins = n)
> )
>
> # The real reason I'm exploring this is as a more efficient version
> # of tabulate for doing equal bin counts.  The Rcpp version is about 10x
> # faster, mainly (I think) because it avoids creating a modified copy of the
> # vector
>
> system.time(y3 <- tabulate((x - origin) / binwidth + 1, nbins = n))
> all.equal(y1, y3)
>
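
On the second explanation in the quoted message: as far as I understand,
Rcpp::NumericVector::iterator is a plain double*, so the iterator version
amounts to raw pointer indexing. The sketch below mirrors count_bini with
std::vector and its data() pointers standing in for the Rcpp vector and
its iterators. The name count_bin_ptr and the use of std::vector are
assumptions for illustration, not the code from this thread. It shows that
the read from x is sequential while the write into counts goes wherever
the data sends it, whichever way the element is addressed.

```cpp
#include <vector>

// Pointer-based variant, mirroring count_bini: take the start of each
// buffer once and index from there. As in the original, there is no
// bounds check on the computed bin.
std::vector<double> count_bin_ptr(const std::vector<double>& x,
                                  double binwidth, double origin,
                                  int nbins) {
    std::vector<double> counts(nbins, 0.0);
    const double* x_i = x.data();
    double* counts_i = counts.data();
    const int n = static_cast<int>(x.size());
    for (int i = 0; i < n; i++) {
        // x_i[i] walks the input in order; the target slot in counts_i
        // depends on the value just read, so the writes are scattered.
        counts_i[static_cast<int>((x_i[i] - origin) / binwidth)]++;
    }
    return counts;
}
```

If both forms reduce to the same address arithmetic, a small difference
between the two timings would not be surprising, though only profiling
could confirm where the time actually goes.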

-- 
Romain Francois
Professional R Enthusiast
http://romainfrancois.blog.free.fr