[Rcpp-devel] Differences between RcppEigen and RcppArmadillo

Dirk Eddelbuettel edd at debian.org
Thu Jun 14 16:11:45 CEST 2012


On 14 June 2012 at 08:01, Douglas Bates wrote:
| On Wed, Jun 13, 2012 at 6:53 PM, Julian Smith <hilbertspaces at gmail.com> wrote:
| > Doesn't svd in R by default compute D, U and V?
| 
| > http://stat.ethz.ch/R-manual/R-patched/library/base/html/svd.html
| 
| You're right but the default is the 'thin' U when X is n by p and n >=
| p.  Does the svd in Armadillo return the full n by n matrix U?

Thanks for that earlier hint re 'thin' and 'full' SVDs.
 
| Another thing to check is what underlying Lapack routine is called.
| There are two: dgesvd, which is the older algorithm and the one with
| the expected name if you know the Lapack conventions, and dgesdd which
| is a newer and faster algorithm.  R uses dgesdd.

Good call. 

It looks like Armadillo uses dgesv(d), grep did not find dgesdd it seems: :

edd at max:~/svn/rcpp/pkg/RcppArmadillo/inst/include/armadillo_bits$ grep dges *
atlas_bones.hpp:    int wrapper_clapack_dgesv(const enum CBLAS_ORDER Order, const int N, const int NRHS, double *A, const int lda, int *ipiv, double *B, const int ldb);
atlas_wrapper.hpp:      return arma_atlas(clapack_dgesv)(Order, N, NRHS, (T*)A, lda, ipiv, (T*)B, ldb);
glue_histc_meat.hpp:    "histc(): parameter 'edges' must be a vector"
lapack_bones.hpp:  #define arma_dgesvd dgesvd
lapack_bones.hpp:  #define arma_dgesv  dgesv
lapack_bones.hpp:  #define arma_dgesvd DGESVD
lapack_bones.hpp:  #define arma_dgesv  DGESV
lapack_bones.hpp:  void arma_fortran(arma_dgesvd)(char* jobu, char* jobvt, blas_int* m, blas_int* n, double* a, blas_int* lda, double* s, double* u, blas_int* ldu, double* vt, blas_int* ldvt, double* work, blas_int* lwork, blas_int* info);
lapack_bones.hpp:  void arma_fortran(arma_dgesv)(blas_int* n, blas_int* nrhs, double* a, blas_int* lda, blas_int* ipiv, double* b, blas_int* ldb, blas_int* info);
lapack_wrapper.hpp:      arma_fortran(arma_dgesvd)(jobu, jobvt, m, n, (T*)a, lda, (T*)s, (T*)u, ldu, (T*)vt, ldvt, (T*)work, lwork, info);
lapack_wrapper.hpp:      arma_fortran(arma_dgesv)(n, nrhs, (T*)a, lda, ipiv, (T*)b, ldb, info);
edd at max:~/svn/rcpp/pkg/RcppArmadillo/inst/include/armadillo_bits$ 

Conrad, any interest in switching to dgesdd? 

Dirk, at useR and across the room from Doug
 
| > On Wed, Jun 13, 2012 at 4:07 PM, Douglas Bates <bates at stat.wisc.edu> wrote:
| >>
| >> On Wed, Jun 13, 2012 at 5:16 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
| >> >
| >> > On 13 June 2012 at 15:05, Julian Smith wrote:
| >> > | I agree that RcppEigen is a little bit faster, but ease of use is
| >> > important to
| >> > | me, so I feel like RcppArmadillo might win out in my application.
| >> >
| >> > Yup, that my personal view too.
| >> >
| >> > | | RcppArmadillo will use the very same LAPACK and BLAS libs your R
| >> > session
| >> > | | uses. So MKL, OpenBlas, ... are all options.  Eigen actually has its
| >> > own
| >> > | code
| >> > | | outperforming LAPACK, so it doesn't  as much there.
| >> > |
| >> > | Why do you think R outperforms RcppArmadillo in this example below?
| >> > Anyway to
| >> > | speed this up?
| >> >
| >> > That is odd. "I guess it shouldn't." I shall take another look -- as I
| >> > understand it both should go to the same underlying Lapack routine.  I
| >> > may
| >> > have to consult with Conrad on this.
| >> >
| >> > Thanks for posting a full and reproducible example!
| >> >
| >> > Dirk
| >> >
| >> > | require(RcppArmadillo)
| >> > | require(inline)
| >> > |
| >> > | arma.code <- '
| >> > |   using namespace arma;
| >> > |   NumericMatrix Xr(Xs);
| >> > |   int n = Xr.nrow(), k = Xr.ncol();
| >> > |   mat X(Xr.begin(), n, k, false);
| >> > |   mat U;
| >> > |   vec s;
| >> > |   mat V;
| >> > |   svd(U, s, V, X);
| >> > |   return wrap(s);
| >> > | '
| >>
| >> Because the arma code is evaluating the singular vectors (U and V) as
| >> well as the singular values (S) whereas the R code is only evaluating
| >> the singular values.  There is considerably more effort required to
| >> evaluate the singular vectors in addition to the singular values.
| >>
| >> > | rcppsvd <- cxxfunction(signature(Xs="numeric"),
| >> > |                         arma.code,
| >> > |                         plugin="RcppArmadillo")
| >> > |
| >> > | A<-matrix(rnorm(5000^2), 5000)
| >> > |
| >> > | > system.time(rcppsvd(A))
| >> > |     user   system  elapsed
| >> > | 1992.406    4.862 1988.737
| >> > |
| >> > | > system.time(svd(A))
| >> > |    user  system elapsed
| >> > | 652.496   2.641 652.614
| >> > |
| >> > | On Wed, Jun 13, 2012 at 11:43 AM, Dirk Eddelbuettel <edd at debian.org>
| >> > wrote:
| >> > |
| >> > |
| >> > |     On 13 June 2012 at 10:57, Julian Smith wrote:
| >> > |     | I've been toying with both RcppArmadillo and RcppEigen the past
| >> > few days
| >> > |     and
| >> > |     | don't know which library to continue using. RcppEigen seems
| >> > really slick,
| >> > |     but
| >> > |     | appears to be lacking some of the decompositions I want and
| >> > isn't nearly
| >> > |     as
| >> > |     | fast to code. RcppArmadillo seems about as fast, easier to code
| >> > up etc.
| >> > |     What
| >> > |     | are some of the advantages/disadvantages of both?
| >> > |
| >> > |     That's pretty close.  I have been a fan of [Rcpp]Armadillo which I
| >> > find
| >> > |     easier to get my head around.  Doug, however, moved from
| >> > [Rcpp]Armadillo
| >> > |     to
| >> > |     [Rcpp]Eigen as it has some things he needs.  Eigen should have a
| >> > "larger"
| >> > |     API
| >> > |     than Armadillo, but I find the code and docs harder to navigate.
| >> > |
| >> > |     And you should find Eigen to be a little faster. Andreas Alfons
| >> > went as far
| >> > |     as building 'robustHD' using RcppArmadillo with a drop-in for
| >> > RcppEigen (in
| >> > |     package 'sparseLTSEigen'; both package names from memmory and I
| >> > may have
| >> > |     mistyped).  He reported a performance gain of around 25% for his
| >> > problem
| >> > |     sets.  On the 'fastLm' benchmark, we find the fast Eigen-based
| >> > |     decompositions
| >> > |     to be much faster than Armadillo.
| >> > |
| >> > |     | Can you call LAPACK or BLAS from either? Is there a wrapper in
| >> > RcppEigen
| >> > |     to
| >> > |     | call LAPACK functions? Want some other decomposition methods,
| >> > dont like
| >> > |     the
| >> > |     | JacobiSVD method in Eigen.
| >> > |
| >> > |     You need to differentiate between the Eigen and Armadillo docs
| >> > _for their
| >> > |     libraries_ and what happens when you access the Rcpp* variant from
| >> > R.
| >> > |
| >> > |     RcppArmadillo will use the very same LAPACK and BLAS libs your R
| >> > session
| >> > |     uses. So MKL, OpenBlas, ... are all options.  Eigen actually has
| >> > its own
| >> > |     code
| >> > |     outperforming LAPACK, so it doesn't  as much there.
| >> > |
| >> > |     Hope this helps,   Dirk (at useR!)
| >> > |
| >> > |     |
| >> > |     |
| >> > ----------------------------------------------------------------------
| >> > |     | _______________________________________________
| >> > |     | Rcpp-devel mailing list
| >> > |     | Rcpp-devel at lists.r-forge.r-project.org
| >> > |     |
| >> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
| >> > |     --
| >> > |     Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
| >> > |
| >> > |
| >> >
| >> > --
| >> > Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
| >> > _______________________________________________
| >> > Rcpp-devel mailing list
| >> > Rcpp-devel at lists.r-forge.r-project.org
| >> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
| >
| >

-- 
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com  


More information about the Rcpp-devel mailing list