[Rcpp-devel] How much speedup for matrix operations?

Douglas Bates bates at stat.wisc.edu
Wed Nov 6 18:57:00 CET 2013


By default Eigen does not use BLAS, which can be good or bad, depending on
the situation.  I notice that the second-largest amount of self time is
spent in t.default, which may mean that you are using an operation like

t(X) %*% X

If so, you can save yourself time by using the crossprod or tcrossprod
functions.  For example, the expression above is more cleanly written as

crossprod(X)

or, for t(X) %*% Y,

crossprod(X, Y)
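
To get a rough feel for the difference, here is a quick sketch (the exact
timings will depend on your BLAS and on the matrix sizes):

X <- matrix(rnorm(5000 * 200), 5000, 200)
system.time(for (i in 1:50) t(X) %*% X)    # explicit transpose, then a general multiply (DGEMM)
system.time(for (i in 1:50) crossprod(X))  # one symmetric rank-k update (DSYRK), no transpose
all.equal(t(X) %*% X, crossprod(X))        # same result either way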

Eigen or Armadillo could help you avoid making unnecessary copies, but if
your calculation does end up being dominated by matrix multiplications you
can't expect to gain much speed relative to R.  You may want to check which
BLAS you are using.  For Intel processors MKL is generally the fastest
(though proprietary), with OpenBLAS in second place.
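
To see which BLAS you are currently running: sufficiently recent versions of
R list the BLAS and LAPACK libraries in the output of sessionInfo(); on older
versions you can instead check which BLAS shared library the running R
process has loaded.

sessionInfo()   # look for the BLAS and LAPACK entries in the output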

I see that, as often happens, I am giving you advice similar to Dirk's.
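
If you do decide to try the RcppArmadillo route, a minimal sketch of the
t(X) %*% X piece might look like the following (cppFunction() comes from
Rcpp; the function name armaCrossprod is just for illustration):

library(Rcpp)
cppFunction(depends = "RcppArmadillo", code = '
  arma::mat armaCrossprod(const arma::mat& X) {
      // Armadillo can fold the transpose into the underlying BLAS call
      // rather than forming t(X) explicitly.
      return arma::trans(X) * X;
  }
')
X <- matrix(rnorm(2000 * 200), 2000, 200)
all.equal(crossprod(X), armaCrossprod(X))

The heavy multiplications still end up in whatever BLAS R is linked against,
so the realistic gains come from avoiding unnecessary copies and temporaries
in the surrounding code rather than from the multiplications themselves.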


On Wed, Nov 6, 2013 at 11:35 AM, Xavier Robin <xavier at cbs.dtu.dk> wrote:

> Hi,
>
> I have pure R code that spends most of its time performing vector and
> matrix operations, as shown by this summary from Rprof:
>
>>                   self.time self.pct total.time total.pct
>> "%*%"                903.24    77.67     903.24     77.67
>> "t.default"           76.26     6.56      76.26      6.56
>> "-"                   36.60     3.15      36.60      3.15
>> "+"                   24.44     2.10      24.44      2.10
>> "/"                   24.22     2.08      24.22      2.08
>> "exp"                 20.26     1.74      20.26      1.74
>> "predict.myClass"     17.68     1.52     503.82     43.32
>> "*"                   11.90     1.02      11.90      1.02
>> "t"                    9.38     0.81     811.94     69.82
>> "update.batch"         8.04     0.69     654.68     56.30
>> ...
>>
> So mostly matrix %*% matrix multiplications, transpositions, vector +-/*
> matrix operations and exponentiations, representing >95% of the computation
> time.
> I have very few loops and if/else blocks.
>
> I want to speed up this code, and I am considering reimplementing it (or
> part of it) with RcppEigen or RcppArmadillo.
>
> However, I read that both Eigen and Armadillo use the underlying BLAS,
> just as R does.
> My question is: can I expect any significant speed-up from an Rcpp
> re-implementation in this case, given that it is already mostly matrix
> algebra (which is supposed to be pretty efficient in R)?
>
> Thanks,
> Xavier
>
> --
> Xavier Robin, PhD
> Cellular Signal Integration Group (C-SIG) - http://www.lindinglab.org
> Center for Biological Sequence Analysis (CBS) - http://www.cbs.dtu.dk
> Department of Systems Biology - Technical University of Denmark (DTU)
> Anker Engelundsvej, Building 301, DK-2800 Lyngby, DENMARK.
>