[Rcpp-devel] Inverting a triangular matrix appears slow using Armadillo and openblas

Dirk Eddelbuettel edd at debian.org
Fri Oct 18 20:04:35 CEST 2024


On 18 October 2024 at 09:39, Robin Liu wrote:
| Dear all,
| 
| Computing the Cholesky factor of a matrix is on the order of n^3 while
| inverting a triangular system is n^2. However, it appears that these two
| operations take roughly the same time using armadillo.
| 
| Unit: milliseconds
|      ticker   mean     sd    min    max neval
|        chol  80.81  3.001  75.84  88.11    20
|    inv-chol  84.89  4.449  80.78  97.30    20
|  inv-direct 274.39 10.787 260.85 303.20    20
| 
| Inverting the triangular Cholesky factor is fast relative to direct inversion,
| but is of the same order as computing the factor itself. Is this to be
| expected? I am running with 
| BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
| LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;
|  LAPACK version 3.10.0 
| along with OPENBLAS_NUM_THREADS=1.

The point of OpenBLAS is to be multithreaded. Why do you constrain it to one thread?

| I admit my question is not specific to Rcpp or RcppArmadillo, but I thought
| someone on this list could advise. 

What is the reference point?  "Slow" is relative to what here? Did you try to
vary the matrix dimension too and see if the change in time is aligned with
your expectations?

In any event, Rcpp "is a messenger here" as is RcppArmadillo as is Armadillo
as the actual operations are plain BLAS operations and this list may no be
the most focussed forum to discuss the topic.

Dirk
 
| My code follows:
| ```
| library(Rcpp)
| library(RcppArmadillo)
| 
| src <-
|   r"(#include <RcppArmadillo.h>
| #include "/usr/local/lib/R/site-library/RcppClock/include/RcppClock.h"
| using namespace arma;
| 
| // [[Rcpp:depends(RcppClock)]]
| // [[Rcpp::depends(RcppArmadillo)]]
| // [[Rcpp::export]]
| void chol_and_inv(const arma::mat &V) {
|   Rcpp::Clock clock;
|   mat R, Rinv, Vinv;
|   for (int i = 0; i < 20; i++) {
|     clock.tick("chol");
|     R = chol(V);
|     clock.tock("chol");
|     clock.tick("inv-chol");
|     Rinv = inv(trimatu(R));
|     clock.tock("inv-chol");
|     clock.tick("inv-direct");
|     Vinv = inv(V);
|     clock.tock("inv-direct");
|   }
|   clock.stop("timer");
|   return;
| })"
| 
| # Create large PD matrix
| size <- 2000
| set.seed(123)
| m <- matrix(rnorm(size^2), size, size)
| m <- tcrossprod(m)
| diag(m) <- rowSums(abs(m)) + 2
| 
| sourceCpp(code = src)
| result <- chol_and_inv(m)
| library(RcppClock)
| timer
| ```
| Best wishes,
| _______________________________________________
| Rcpp-devel mailing list
| Rcpp-devel at lists.r-forge.r-project.org
| https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel

-- 
dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org


More information about the Rcpp-devel mailing list