[Rcpp-devel] Sparse matrices with RcppArmadillo

Romain Francois romain at r-enthusiasts.com
Sat Dec 8 20:55:52 CET 2012


Le 08/12/12 19:54, Douglas Bates a écrit :
> On Sat, Dec 8, 2012 at 10:35 AM, c s <conradsand.arma at gmail.com
> <mailto:conradsand.arma at gmail.com>> wrote:
>
>     Armadillo sparse matrices are stored in Compressed Sparse Column format:
>
>     http://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_column_.28CSC_or_CCS.29
>
>     This layout is used by a majority of external solvers.
>
>     It would be far more efficient to take this layout into account when
>     copying matrices, instead of blindly (and slowly) copying element by
>     element.
>
>
> dgCMatrix objects also use compressed sparse column format with
> zero-based indices so it is likely that it would only be necessary to
> copy the contents of the arrays of column pointers (the "p" slot), row
> indices (the "i" slot) and values of the non-zeros (the "x" slot).  It
> may even be possible to just map the pointers for a read-only sparse matrix.

Alright

> Romain:
>   In general it is very slow to insert non-zero elements into this
> format.  In the worst case the entire structure must be copied and
> extended for each insertion.  We have to keep telling people about this
> when they tell us that sparse matrices in R are so slow to work with.
>
>   You have seen what the layout of a dgCMatrix is so it is only
> necessary to find the desired components of a arma::sp_mat object.  I'll
> take a look once I get some things installed.  If it is desirable the as
> and wrap methods for the Eigen::SparseMatrix<T> and
> Eigen::MappedSparseMatrix<T> classes should be fairly easy to adapt for
> arma::sp_mat class.

Yes. Conrad left a back door open so that we can add things **inside** 
the SpMat template, similar to what we have done in dense matrix classes.

Going from there I was looking at the copy constructor of a SpMat, which 
eventually is a call to init:

/**
  * Copy from another matrix.
  */
template<typename eT>
inline
void
SpMat<eT>::init(const SpMat<eT>& x)
   {
   arma_extra_debug_sigprint();

   // Ensure we are not initializing to ourselves.
   if (this != &x)
     {
     init(x.n_rows, x.n_cols);

     // values and row_indices may not be null.
     if (values != NULL)
       {
       memory::release(values);
       memory::release(row_indices);
       }

     access::rw(values)      = memory::acquire_chunked<eT> 
(x.n_nonzero + 1);
     access::rw(row_indices) = 
memory::acquire_chunked<uword>(x.n_nonzero + 1);

     // Now copy over the elements.
     arrayops::copy(access::rwp(values),      x.values,      x.n_nonzero 
+ 1);
     arrayops::copy(access::rwp(row_indices), x.row_indices, x.n_nonzero 
+ 1);
     arrayops::copy(access::rwp(col_ptrs),    x.col_ptrs,    x.n_cols + 1);

     access::rw(n_nonzero) = x.n_nonzero;
     }
   }

So it all looks very doable.

Romain


>     On Sunday, December 9, 2012, Romain Francois
>     <romain at r-enthusiasts.com <mailto:romain at r-enthusiasts.com>> wrote:
>      > Le 08/12/12 09:45, Søren Højsgaard a écrit :
>      >>
>      >> Dear all,
>      >>
>      >> I want to use a matrix (of type "dgCMatrix" from the Matrix
>     package) in RcppArmadillo, so I do:
>      >>
>      >> library(inline)
>      >> src <- '
>      >> using namespace arma;
>      >> using namespace Rcpp;
>      >> SpMat<double> X = as<SpMat<double> >(XX_);
>      >> '
>      >> foo <- cxxfunction(signature(XX_=""), body=src,
>     plugin="RcppArmadillo")
>      >>
>      >> - but this fails. It seems to me (browsing the web) that SpMat
>     are supported, but I might be wrong here. I have no indication that
>     dgCMatrix matrices can be "converted" to SpMat's. I know that I can
>     work with dgCMatrix matrices with RcppEigen, but what I need is to
>     extract submatrices and that is very easy to do with RcppArmadillo.
>      >>
>      >> Any thoughts? Thanks in advance.
>      >> Regards
>      >> Søren
>      >
>      > Doug might know better about the internals of Matrix types. This
>     is just following the recipee from how these are handled in RcppEigen:
>      >
>      > #include <RcppArmadillo.h>
>      > // [[Rcpp::depends(RcppArmadillo)]]
>      > using namespace Rcpp ;
>      >
>      > // [[Rcpp::export]]
>      > void convert(S4 mat){
>      >     IntegerVector dims = mat.slot( "Dim" ) ;
>      >     IntegerVector i = mat.slot( "i" ) ;
>      >     IntegerVector p = mat.slot( "p" ) ;
>      >     NumericVector x = mat.slot( "x" ) ;
>      >
>      >     int nrow = dims[0], ncol = dims[1] ;
>      >     arma::sp_mat res( nrow, ncol) ;
>      >     for(int j = 0; j < ncol; ++j) {
>      >         for (int k = p[j]; k < p[j + 1]; ++k) res( i[k], j ) = x[k];
>      >     }
>      >     std::cout << res << std::endl ;
>      >
>      > }
>      >
>      >
>      > /*** R
>      >     require(Matrix)
>      >     i <- c(1,3:8); j <- c(2,9,6:10); x <- 7 * (1:7)
>      >     ( A <- sparseMatrix(i, j, x = x) )
>      >
>      >     convert(A)
>      > ***/
>      >
>      > I don't think there is a better way to fill multiple values ina
>     SpMat, maybe Conrad has insights.
>      >
>      > Romain

-- 
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30

R Graph Gallery: http://gallery.r-enthusiasts.com

blog:            http://romainfrancois.blog.free.fr
|- http://bit.ly/RE6sYH : OOP with Rcpp modules
`- http://bit.ly/Thw7IK : Rcpp modules more flexible



More information about the Rcpp-devel mailing list