[Rcpp-devel] Most efficient way to apply function over rows of a matrix with Rcpp?

Romain Francois romain at r-enthusiasts.com
Wed Jan 30 13:33:35 CET 2013


Le 18/01/13 03:09, Kevin Ushey a écrit :
> Hi guys,
>
> I'm trying to use Rcpp to implement an 'apply'-type function for
> matrices, whereby I apply a function to each row or column of a matrix.
> I'm testing it here with the 'mean' just to see how well we can do vs.
> the highly optimized rowMeans, colMeans functions, but of course any
> other function taking a vector and returning a scalar might fit here.
>
> I'm wondering if I can do better. Please see the gist in the link here;
> you can sourceCpp it in R.
>
> https://gist.github.com/4561281
>
> I try to limit the amount of copying as much as possible with
> NumericMatrix::Column and NumericMatrix::Row.
>
> Note that the Rcpp solution over columns is just as fast as colMeans,
> but over rows it's a fair bit slower relative to rowMeans. Is there any
> way I could improve this?
>
> I plan to submit an expanded version of this to the Rcpp gallery so any
> advice is appreciated!
>
> Thanks,
> -Kevin

The use of the NumericMatrix::Row has some cost, each time we access 
tmp[i], we must calculate the right offset :

inline int offset( int i, int j) const { return i + nrows * j ; }

You could implement it yourself taking into account the structure of the 
matrix:

// [[Rcpp::export]]
NumericVector row_means_2( NumericMatrix& X ) {

   int nRows = X.nrow(), nCols = X.ncol() ;
   NumericVector out = no_init(nRows);

   for( int j=0, k=0; j < nCols; j++ ) {
     for( int i=0; i < nRows; i++, k++ ) {
       out[i] += X[k] ;
     }
   }
   for( int i =0; i<nRows; i++) out[i] /= nCols ;

   return out;

}

With this I get:

 > benchmark(replications = 5, order = NULL, apply(x,
+     1, mean), rowMeans(x), row_means(x), row_means_2(x))
                test replications elapsed  relative user.self sys.self
1 apply(x, 1, mean)            5   0.266 24.181818     0.253    0.013
2       rowMeans(x)            5   0.012  1.090909     0.011    0.000
3      row_means(x)            5   0.041  3.727273     0.041    0.000
4    row_means_2(x)            5   0.011  1.000000     0.011    0.000
   user.child sys.child
1          0         0
2          0         0
3          0         0
4          0         0


Maybe there is something we can do at our end to improve this by making 
a more efficient iterator over a matrix row.

Romain

-- 
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30

R Graph Gallery: http://gallery.r-enthusiasts.com

blog:            http://romainfrancois.blog.free.fr
|- http://bit.ly/RE6sYH : OOP with Rcpp modules
`- http://bit.ly/Thw7IK : Rcpp modules more flexible



More information about the Rcpp-devel mailing list