[Rcppdevel] Differences between RcppEigen and RcppArmadillo
Dirk Eddelbuettel
edd at debian.org
Thu Jun 14 16:11:45 CEST 2012
On 14 June 2012 at 08:01, Douglas Bates wrote:
 On Wed, Jun 13, 2012 at 6:53 PM, Julian Smith <hilbertspaces at gmail.com> wrote:
 > Doesn't svd in R by default compute D, U and V?

 > http://stat.ethz.ch/Rmanual/Rpatched/library/base/html/svd.html

 You're right but the default is the 'thin' U when X is n by p and n >=
 p. Does the svd in Armadillo return the full n by n matrix U?
Thanks for that earlier hint re 'thin' and 'full' SVDs.
 Another thing to check is what underlying Lapack routine is called.
 There are two: dgesvd, which is the older algorithm and the one with
 the expected name if you know the Lapack conventions, and dgesdd which
 is a newer and faster algorithm. R uses dgesdd.
Good call.
It looks like Armadillo uses dgesv(d), grep did not find dgesdd it seems: :
edd at max:~/svn/rcpp/pkg/RcppArmadillo/inst/include/armadillo_bits$ grep dges *
atlas_bones.hpp: int wrapper_clapack_dgesv(const enum CBLAS_ORDER Order, const int N, const int NRHS, double *A, const int lda, int *ipiv, double *B, const int ldb);
atlas_wrapper.hpp: return arma_atlas(clapack_dgesv)(Order, N, NRHS, (T*)A, lda, ipiv, (T*)B, ldb);
glue_histc_meat.hpp: "histc(): parameter 'edges' must be a vector"
lapack_bones.hpp: #define arma_dgesvd dgesvd
lapack_bones.hpp: #define arma_dgesv dgesv
lapack_bones.hpp: #define arma_dgesvd DGESVD
lapack_bones.hpp: #define arma_dgesv DGESV
lapack_bones.hpp: void arma_fortran(arma_dgesvd)(char* jobu, char* jobvt, blas_int* m, blas_int* n, double* a, blas_int* lda, double* s, double* u, blas_int* ldu, double* vt, blas_int* ldvt, double* work, blas_int* lwork, blas_int* info);
lapack_bones.hpp: void arma_fortran(arma_dgesv)(blas_int* n, blas_int* nrhs, double* a, blas_int* lda, blas_int* ipiv, double* b, blas_int* ldb, blas_int* info);
lapack_wrapper.hpp: arma_fortran(arma_dgesvd)(jobu, jobvt, m, n, (T*)a, lda, (T*)s, (T*)u, ldu, (T*)vt, ldvt, (T*)work, lwork, info);
lapack_wrapper.hpp: arma_fortran(arma_dgesv)(n, nrhs, (T*)a, lda, ipiv, (T*)b, ldb, info);
edd at max:~/svn/rcpp/pkg/RcppArmadillo/inst/include/armadillo_bits$
Conrad, any interest in switching to dgesdd?
Dirk, at useR and across the room from Doug
 > On Wed, Jun 13, 2012 at 4:07 PM, Douglas Bates <bates at stat.wisc.edu> wrote:
 >>
 >> On Wed, Jun 13, 2012 at 5:16 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
 >> >
 >> > On 13 June 2012 at 15:05, Julian Smith wrote:
 >> >  I agree that RcppEigen is a little bit faster, but ease of use is
 >> > important to
 >> >  me, so I feel like RcppArmadillo might win out in my application.
 >> >
 >> > Yup, that my personal view too.
 >> >
 >> >   RcppArmadillo will use the very same LAPACK and BLAS libs your R
 >> > session
 >> >   uses. So MKL, OpenBlas, ... are all options. Eigen actually has its
 >> > own
 >> >  code
 >> >   outperforming LAPACK, so it doesn't as much there.
 >> > 
 >> >  Why do you think R outperforms RcppArmadillo in this example below?
 >> > Anyway to
 >> >  speed this up?
 >> >
 >> > That is odd. "I guess it shouldn't." I shall take another look  as I
 >> > understand it both should go to the same underlying Lapack routine. I
 >> > may
 >> > have to consult with Conrad on this.
 >> >
 >> > Thanks for posting a full and reproducible example!
 >> >
 >> > Dirk
 >> >
 >> >  require(RcppArmadillo)
 >> >  require(inline)
 >> > 
 >> >  arma.code < '
 >> >  using namespace arma;
 >> >  NumericMatrix Xr(Xs);
 >> >  int n = Xr.nrow(), k = Xr.ncol();
 >> >  mat X(Xr.begin(), n, k, false);
 >> >  mat U;
 >> >  vec s;
 >> >  mat V;
 >> >  svd(U, s, V, X);
 >> >  return wrap(s);
 >> >  '
 >>
 >> Because the arma code is evaluating the singular vectors (U and V) as
 >> well as the singular values (S) whereas the R code is only evaluating
 >> the singular values. There is considerably more effort required to
 >> evaluate the singular vectors in addition to the singular values.
 >>
 >> >  rcppsvd < cxxfunction(signature(Xs="numeric"),
 >> >  arma.code,
 >> >  plugin="RcppArmadillo")
 >> > 
 >> >  A<matrix(rnorm(5000^2), 5000)
 >> > 
 >> >  > system.time(rcppsvd(A))
 >> >  user system elapsed
 >> >  1992.406 4.862 1988.737
 >> > 
 >> >  > system.time(svd(A))
 >> >  user system elapsed
 >> >  652.496 2.641 652.614
 >> > 
 >> >  On Wed, Jun 13, 2012 at 11:43 AM, Dirk Eddelbuettel <edd at debian.org>
 >> > wrote:
 >> > 
 >> > 
 >> >  On 13 June 2012 at 10:57, Julian Smith wrote:
 >> >   I've been toying with both RcppArmadillo and RcppEigen the past
 >> > few days
 >> >  and
 >> >   don't know which library to continue using. RcppEigen seems
 >> > really slick,
 >> >  but
 >> >   appears to be lacking some of the decompositions I want and
 >> > isn't nearly
 >> >  as
 >> >   fast to code. RcppArmadillo seems about as fast, easier to code
 >> > up etc.
 >> >  What
 >> >   are some of the advantages/disadvantages of both?
 >> > 
 >> >  That's pretty close. I have been a fan of [Rcpp]Armadillo which I
 >> > find
 >> >  easier to get my head around. Doug, however, moved from
 >> > [Rcpp]Armadillo
 >> >  to
 >> >  [Rcpp]Eigen as it has some things he needs. Eigen should have a
 >> > "larger"
 >> >  API
 >> >  than Armadillo, but I find the code and docs harder to navigate.
 >> > 
 >> >  And you should find Eigen to be a little faster. Andreas Alfons
 >> > went as far
 >> >  as building 'robustHD' using RcppArmadillo with a dropin for
 >> > RcppEigen (in
 >> >  package 'sparseLTSEigen'; both package names from memmory and I
 >> > may have
 >> >  mistyped). He reported a performance gain of around 25% for his
 >> > problem
 >> >  sets. On the 'fastLm' benchmark, we find the fast Eigenbased
 >> >  decompositions
 >> >  to be much faster than Armadillo.
 >> > 
 >> >   Can you call LAPACK or BLAS from either? Is there a wrapper in
 >> > RcppEigen
 >> >  to
 >> >   call LAPACK functions? Want some other decomposition methods,
 >> > dont like
 >> >  the
 >> >   JacobiSVD method in Eigen.
 >> > 
 >> >  You need to differentiate between the Eigen and Armadillo docs
 >> > _for their
 >> >  libraries_ and what happens when you access the Rcpp* variant from
 >> > R.
 >> > 
 >> >  RcppArmadillo will use the very same LAPACK and BLAS libs your R
 >> > session
 >> >  uses. So MKL, OpenBlas, ... are all options. Eigen actually has
 >> > its own
 >> >  code
 >> >  outperforming LAPACK, so it doesn't as much there.
 >> > 
 >> >  Hope this helps, Dirk (at useR!)
 >> > 
 >> >  
 >> >  
 >> > 
 >> >   _______________________________________________
 >> >   Rcppdevel mailing list
 >> >   Rcppdevel at lists.rforge.rproject.org
 >> >  
 >> > https://lists.rforge.rproject.org/cgibin/mailman/listinfo/rcppdevel
 >> >  
 >> >  Dirk Eddelbuettel  edd at debian.org  http://dirk.eddelbuettel.com
 >> > 
 >> > 
 >> >
 >> > 
 >> > Dirk Eddelbuettel  edd at debian.org  http://dirk.eddelbuettel.com
 >> > _______________________________________________
 >> > Rcppdevel mailing list
 >> > Rcppdevel at lists.rforge.rproject.org
 >> > https://lists.rforge.rproject.org/cgibin/mailman/listinfo/rcppdevel
 >
 >

Dirk Eddelbuettel  edd at debian.org  http://dirk.eddelbuettel.com
More information about the Rcppdevel
mailing list