[Rcpp-devel] Differences between RcppEigen and RcppArmadillo
Dirk Eddelbuettel
edd at debian.org
Thu Jun 14 16:11:45 CEST 2012
On 14 June 2012 at 08:01, Douglas Bates wrote:
| On Wed, Jun 13, 2012 at 6:53 PM, Julian Smith <hilbertspaces at gmail.com> wrote:
| > Doesn't svd in R by default compute D, U and V?
|
| > http://stat.ethz.ch/R-manual/R-patched/library/base/html/svd.html
|
| You're right but the default is the 'thin' U when X is n by p and n >=
| p. Does the svd in Armadillo return the full n by n matrix U?
Thanks for that earlier hint re 'thin' and 'full' SVDs.
| Another thing to check is what underlying Lapack routine is called.
| There are two: dgesvd, which is the older algorithm and the one with
| the expected name if you know the Lapack conventions, and dgesdd which
| is a newer and faster algorithm. R uses dgesdd.
Good call.
It looks like Armadillo uses dgesv(d), grep did not find dgesdd it seems: :
edd at max:~/svn/rcpp/pkg/RcppArmadillo/inst/include/armadillo_bits$ grep dges *
atlas_bones.hpp: int wrapper_clapack_dgesv(const enum CBLAS_ORDER Order, const int N, const int NRHS, double *A, const int lda, int *ipiv, double *B, const int ldb);
atlas_wrapper.hpp: return arma_atlas(clapack_dgesv)(Order, N, NRHS, (T*)A, lda, ipiv, (T*)B, ldb);
glue_histc_meat.hpp: "histc(): parameter 'edges' must be a vector"
lapack_bones.hpp: #define arma_dgesvd dgesvd
lapack_bones.hpp: #define arma_dgesv dgesv
lapack_bones.hpp: #define arma_dgesvd DGESVD
lapack_bones.hpp: #define arma_dgesv DGESV
lapack_bones.hpp: void arma_fortran(arma_dgesvd)(char* jobu, char* jobvt, blas_int* m, blas_int* n, double* a, blas_int* lda, double* s, double* u, blas_int* ldu, double* vt, blas_int* ldvt, double* work, blas_int* lwork, blas_int* info);
lapack_bones.hpp: void arma_fortran(arma_dgesv)(blas_int* n, blas_int* nrhs, double* a, blas_int* lda, blas_int* ipiv, double* b, blas_int* ldb, blas_int* info);
lapack_wrapper.hpp: arma_fortran(arma_dgesvd)(jobu, jobvt, m, n, (T*)a, lda, (T*)s, (T*)u, ldu, (T*)vt, ldvt, (T*)work, lwork, info);
lapack_wrapper.hpp: arma_fortran(arma_dgesv)(n, nrhs, (T*)a, lda, ipiv, (T*)b, ldb, info);
edd at max:~/svn/rcpp/pkg/RcppArmadillo/inst/include/armadillo_bits$
Conrad, any interest in switching to dgesdd?
Dirk, at useR and across the room from Doug
| > On Wed, Jun 13, 2012 at 4:07 PM, Douglas Bates <bates at stat.wisc.edu> wrote:
| >>
| >> On Wed, Jun 13, 2012 at 5:16 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
| >> >
| >> > On 13 June 2012 at 15:05, Julian Smith wrote:
| >> > | I agree that RcppEigen is a little bit faster, but ease of use is
| >> > important to
| >> > | me, so I feel like RcppArmadillo might win out in my application.
| >> >
| >> > Yup, that my personal view too.
| >> >
| >> > | | RcppArmadillo will use the very same LAPACK and BLAS libs your R
| >> > session
| >> > | | uses. So MKL, OpenBlas, ... are all options. Eigen actually has its
| >> > own
| >> > | code
| >> > | | outperforming LAPACK, so it doesn't as much there.
| >> > |
| >> > | Why do you think R outperforms RcppArmadillo in this example below?
| >> > Anyway to
| >> > | speed this up?
| >> >
| >> > That is odd. "I guess it shouldn't." I shall take another look -- as I
| >> > understand it both should go to the same underlying Lapack routine. I
| >> > may
| >> > have to consult with Conrad on this.
| >> >
| >> > Thanks for posting a full and reproducible example!
| >> >
| >> > Dirk
| >> >
| >> > | require(RcppArmadillo)
| >> > | require(inline)
| >> > |
| >> > | arma.code <- '
| >> > | using namespace arma;
| >> > | NumericMatrix Xr(Xs);
| >> > | int n = Xr.nrow(), k = Xr.ncol();
| >> > | mat X(Xr.begin(), n, k, false);
| >> > | mat U;
| >> > | vec s;
| >> > | mat V;
| >> > | svd(U, s, V, X);
| >> > | return wrap(s);
| >> > | '
| >>
| >> Because the arma code is evaluating the singular vectors (U and V) as
| >> well as the singular values (S) whereas the R code is only evaluating
| >> the singular values. There is considerably more effort required to
| >> evaluate the singular vectors in addition to the singular values.
| >>
| >> > | rcppsvd <- cxxfunction(signature(Xs="numeric"),
| >> > | arma.code,
| >> > | plugin="RcppArmadillo")
| >> > |
| >> > | A<-matrix(rnorm(5000^2), 5000)
| >> > |
| >> > | > system.time(rcppsvd(A))
| >> > | user system elapsed
| >> > | 1992.406 4.862 1988.737
| >> > |
| >> > | > system.time(svd(A))
| >> > | user system elapsed
| >> > | 652.496 2.641 652.614
| >> > |
| >> > | On Wed, Jun 13, 2012 at 11:43 AM, Dirk Eddelbuettel <edd at debian.org>
| >> > wrote:
| >> > |
| >> > |
| >> > | On 13 June 2012 at 10:57, Julian Smith wrote:
| >> > | | I've been toying with both RcppArmadillo and RcppEigen the past
| >> > few days
| >> > | and
| >> > | | don't know which library to continue using. RcppEigen seems
| >> > really slick,
| >> > | but
| >> > | | appears to be lacking some of the decompositions I want and
| >> > isn't nearly
| >> > | as
| >> > | | fast to code. RcppArmadillo seems about as fast, easier to code
| >> > up etc.
| >> > | What
| >> > | | are some of the advantages/disadvantages of both?
| >> > |
| >> > | That's pretty close. I have been a fan of [Rcpp]Armadillo which I
| >> > find
| >> > | easier to get my head around. Doug, however, moved from
| >> > [Rcpp]Armadillo
| >> > | to
| >> > | [Rcpp]Eigen as it has some things he needs. Eigen should have a
| >> > "larger"
| >> > | API
| >> > | than Armadillo, but I find the code and docs harder to navigate.
| >> > |
| >> > | And you should find Eigen to be a little faster. Andreas Alfons
| >> > went as far
| >> > | as building 'robustHD' using RcppArmadillo with a drop-in for
| >> > RcppEigen (in
| >> > | package 'sparseLTSEigen'; both package names from memmory and I
| >> > may have
| >> > | mistyped). He reported a performance gain of around 25% for his
| >> > problem
| >> > | sets. On the 'fastLm' benchmark, we find the fast Eigen-based
| >> > | decompositions
| >> > | to be much faster than Armadillo.
| >> > |
| >> > | | Can you call LAPACK or BLAS from either? Is there a wrapper in
| >> > RcppEigen
| >> > | to
| >> > | | call LAPACK functions? Want some other decomposition methods,
| >> > dont like
| >> > | the
| >> > | | JacobiSVD method in Eigen.
| >> > |
| >> > | You need to differentiate between the Eigen and Armadillo docs
| >> > _for their
| >> > | libraries_ and what happens when you access the Rcpp* variant from
| >> > R.
| >> > |
| >> > | RcppArmadillo will use the very same LAPACK and BLAS libs your R
| >> > session
| >> > | uses. So MKL, OpenBlas, ... are all options. Eigen actually has
| >> > its own
| >> > | code
| >> > | outperforming LAPACK, so it doesn't as much there.
| >> > |
| >> > | Hope this helps, Dirk (at useR!)
| >> > |
| >> > | |
| >> > | |
| >> > ----------------------------------------------------------------------
| >> > | | _______________________________________________
| >> > | | Rcpp-devel mailing list
| >> > | | Rcpp-devel at lists.r-forge.r-project.org
| >> > | |
| >> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
| >> > | --
| >> > | Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
| >> > |
| >> > |
| >> >
| >> > --
| >> > Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
| >> > _______________________________________________
| >> > Rcpp-devel mailing list
| >> > Rcpp-devel at lists.r-forge.r-project.org
| >> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
| >
| >
--
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
More information about the Rcpp-devel
mailing list