[Rcpp-devel] How much speedup for matrix operations?

Dirk Eddelbuettel edd at debian.org
Wed Nov 6 18:53:26 CET 2013


On 6 November 2013 at 18:35, Xavier Robin wrote:
| I have a pure-R code that spends most of the time performing vector and 
| matrix operations, as shown by the summary of Rprof:
| >                    self.time self.pct total.time total.pct
| > "%*%"                 903.24    77.67     903.24     77.67
| > "t.default"            76.26     6.56      76.26      6.56
| > "-"                    36.60     3.15      36.60      3.15
| > "+"                    24.44     2.10      24.44      2.10
| > "/"                    24.22     2.08      24.22      2.08
| > "exp"                  20.26     1.74      20.26      1.74
| > "predict.myClass"      17.68     1.52     503.82     43.32
| > "*"                    11.90     1.02      11.90      1.02
| > "t"                     9.38     0.81     811.94     69.82
| > "update.batch"          8.04     0.69     654.68     56.30
| > ...
| So mostly matrix %*% matrix multiplications, transpositions, vector +-/* 
| matrix operations and exponentiations, representing >95% of the 
| computation time.
| I have very few loops and if/else blocks.
| 
| I want to speed up this code, and I am considering reimplementing it (or 
| part of it) with RcppEigen or RcppArmadillo.
| 
| However, I read that both Eigen and Armadillo use the underlying BLAS, 
| like R.
| My question is, can I expect any significant speed-up from an Rcpp 
| re-implementation in this case, given it is already mostly matrix 
| algebra (which is supposed to be pretty efficient in R)?

I think you already answered your question. :)

In the very narrow sense you cannot gain much, as the actual matrix
multiplication is done by both R and Armadillo via a call to the _same_ BLAS
routine, dgemm (a Level 3 BLAS routine).  (Eigen is special as it bypasses
BLAS, which they have found to be helpful. But I also had cases where
Armadillo was faster than Eigen; it all depends.)
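
For reference, the BLAS matrix-multiply routine mentioned above computes
C <- alpha * A * B + beta * C on column-major matrices. Below is a minimal
sketch of that operation in plain C++, with no BLAS involved. The name
gemm_naive and its argument layout are made up for illustration and are not
the real BLAS interface; a tuned BLAS does the same arithmetic with blocking
and vectorization, which is where the speed comes from.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Naive reference for the operation dgemm performs:
//   C <- alpha * A * B + beta * C
// Matrices are column-major (as in R and Fortran BLAS): element (i, j) of a
// matrix with `rows` rows lives at index i + j * rows.
// gemm_naive is a hypothetical name, not a BLAS entry point.
void gemm_naive(std::size_t m, std::size_t n, std::size_t k,
                double alpha,
                const std::vector<double>& A,   // m x k
                const std::vector<double>& B,   // k x n
                double beta,
                std::vector<double>& C) {       // m x n, updated in place
    assert(A.size() == m * k && B.size() == k * n && C.size() == m * n);
    for (std::size_t j = 0; j < n; ++j) {
        // Scale the j-th column of C by beta first.
        for (std::size_t i = 0; i < m; ++i) {
            C[i + j * m] *= beta;
        }
        // Accumulate alpha * A[, p] * B[p, j] into the j-th column of C.
        for (std::size_t p = 0; p < k; ++p) {
            const double b = alpha * B[p + j * k];
            for (std::size_t i = 0; i < m; ++i) {
                C[i + j * m] += A[i + p * m] * b;
            }
        }
    }
}
```

Since R and Armadillo both hand this step to the same library, rewriting the
call site in C++ does not change the cost of the multiplication itself.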

Your best bet is to get a faster BLAS library: OpenBLAS is good and free,
Intel MKL is another choice, ATLAS is a longtime favorite.

In the wider sense you can gain by doing all the related ops of setting up
the matrices at the compiled layer rather than in R.
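
To illustrate what moving the surrounding ops to the compiled layer can look
like: in R, an expression such as 1 / (1 + exp(-(x - b))) allocates a
temporary vector for each of "-", "exp", "+" and "/". A compiled loop can do
the same work in one pass with a single output allocation. The formula and
the name fused_logistic are assumptions chosen for illustration, not taken
from the code being profiled, and the sketch uses plain std::vector rather
than Rcpp or Armadillo types.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical example: out[i] = 1 / (1 + exp(-(x[i] - b[i]))).
// All four elementwise steps are fused into one loop, so no intermediate
// vectors are created; only the result vector is allocated.
std::vector<double> fused_logistic(const std::vector<double>& x,
                                   const std::vector<double>& b) {
    assert(x.size() == b.size());
    std::vector<double> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = 1.0 / (1.0 + std::exp(-(x[i] - b[i])));
    }
    return out;
}
```

Whether this pays off depends on how much of the run time sits in such
elementwise steps compared to the matrix multiplications.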

But in the end, for real answers, just try it and profile.

Dirk

-- 
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com

