[Rcpp-devel] Speed of RCppEigen Cholesky decomposition on sparse matrix

Serguei Sokol serguei.sokol at gmail.com
Tue Nov 27 15:33:55 CET 2018


Le 26/11/2018 à 18:23, Hoffman, Gabriel a écrit :
> I am developing a statistical model and I have a prototype working in R
> code.  I make extensive use of sparse matrices, so the R code is pretty
> fast, but hoped that using RCppEigen to evaluate the log-likelihood
> function could avoid a lot of memory copying and be substantially
> faster.  However, in a simple  example I am seeing that RCppEigen is
> 3-5x slower than standard R code for cholesky decomposition of a sparse
> matrix.  This is the case on R 3.5.1 using RcppEigen_0.3.3.4.0 on both
> OS X and CentOS 6.9.
> 
> Since this simple operation is so much slower it doesn't seem like
> using RCppEigen is worth it in this case.  Is this an issue with BLAS,
> some libraries or compiler options, or is R code really the fastest
> option?
After few checks, it seems to be a test issue. Matrix package stores the 
decomposition somewhere in attributes of the submitted matrix. So the 
the repetitive calls requiring chol() decomposition are not really doing 
the job. Instead, previously stored result is reused.

You can easily convince yourself by "modifying" the matrix C (and thus 
invalidating previous decomposition) by doing something like "C + 0." :

system.time(replicate(10, chol( C )))
#utilisateur     système      écoulé
#      0.459       0.011       0.471
system.time(replicate(10, chol( C+0. )))
#utilisateur     système      écoulé
#      5.365       0.060       5.425
system.time(replicate(10, CholSparse( C+0. )))
#utilisateur     système      écoulé
#      3.627       0.035       3.665

On my machine, I have almost identical timing for CholSparse() with or 
without C modification:

system.time(replicate(10, CholSparse( C )))
#utilisateur     système      écoulé
#      3.283       0.004       3.289
which proves that Eigen doesn't store the decomposition for future reuse 
and redo the decomposition at each call on the same matrix.

Best,
Serguei.

> 
> 
> library(Matrix)
> library(inline)
> 
> # construct sparse matrix
> #########################
> 
> # construct a matrix C that is N x N with S total entries
> # set C = crossprod(X)
> N = 100000
> S = 1000000
> i = sample(1:1000, S, replace=TRUE)
> j = sample(1:1000, S, replace=TRUE)
> values = runif(S, 0, .3)
> X = sparseMatrix(i=i, j=j, x = values, symmetric=FALSE )
> 
> C = as(crossprod(X), "dgCMatrix")
> 
> # check sparsity fraction
> S / N^2
> 
> # define RCppEigen code
> CholeskyCppSparse<-'
> using Rcpp::as;
> using Eigen::Map;
> using Eigen::SparseMatrix;
> using Eigen::MappedSparseMatrix;
> using Eigen::SimplicialLLT;
> 
> // get data into RcppEigen
> const MappedSparseMatrix<double> Sigma(as<MappedSparseMatrix<double> 
>  >(Sigma_in));
> 
> // compute Cholesky
> typedef SimplicialLLT<SparseMatrix<double> > SpChol;
> const SpChol Ch(Sigma);
> '
> 
> CholSparse <- cxxfunction(signature(Sigma_in = "dgCMatrix"), 
> CholeskyCppSparse, plugin = "RcppEigen")
> 
> # compare times
> system.time(replicate(10, chol( C )))
> # output:
> #   user  system elapsed
> #  0.341   0.014   0.355
> 
> system.time(replicate(10, CholSparse( C )))
> # output:
> #   user  system elapsed
> # 1.639   0.046   1.687
> 
>     sessionInfo()
> 
> R version 3.5.1 (2018-07-02)
> Platform: x86_64-apple-darwin15.6.0 (64-bit)
> Running under: macOS  10.14
> 
> Matrix products: default
> BLAS:
> /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
> LAPACK:
> /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
> 
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> 
> attached base packages:
> [1] stats     graphics  grDevices datasets  utils     methods   base
> 
> other attached packages:
> [1] inline_0.3.15 Matrix_1.2-15
> 
> loaded via a namespace (and not attached):
> [1] compiler_3.5.1      RcppEigen_0.3.3.4.0 Rcpp_1.0.0
> [4] grid_3.5.1          lattice_0.20-38
> 
> Changing the size of the matrix and the number of entries does not
> change the relative times much
> 
> Thanks,
> - Gabriel
> 
> 
> _______________________________________________
> Rcpp-devel mailing list
> Rcpp-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
> 



More information about the Rcpp-devel mailing list