[Rcpp-devel] [RcppArmadillo] Result of Rcpp Wrap() for Sparse Matrix

Serguei Sokol serguei.sokol at gmail.com
Wed Jun 14 17:55:49 CEST 2017


On 14/06/2017 at 17:33, Douglas Bates wrote:
> 
> 
> On Wed, Jun 14, 2017 at 9:06 AM Serguei Sokol <serguei.sokol at gmail.com> wrote:
> 
>     On 14/06/2017 at 15:21, Douglas Bates wrote:
>      >
>      >
>      > On Wed, Jun 14, 2017 at 3:59 AM Serguei Sokol <serguei.sokol at gmail.com> wrote:
>      >
>      >     On 13/06/2017 at 18:24, Douglas Bates wrote:
>      >      > On Tue, Jun 13, 2017 at 10:56 AM Binxiang Ni <binxiangni at gmail.com> wrote:
>      >      >
>      >      >     Hi,
>      >      >
>      >      >     I am working on fixing sparse matrix conversion for RcppArmadillo. A question has come up: what kind of sparse matrix is expected to
>      >      >     pass from Armadillo to R? That is, what should the result of wrap() be: dgCMatrix (or, if logical, lgCMatrix or ngCMatrix), or their original type?
>      >      >
>      >      >
>      >      > What do you mean by "their original type"?
>      >      >
>      >      > It seems that the correspondence is
>      >      > Armadillo      Matrix package
>      >      > sp_mat     <=> dgCMatrix
>      >      > sp_cx_mat  <=> zgCMatrix
>      >      > sp_imat    <=> igCMatrix
>      >     I would also consider the format used in the slam package.
>      >     It simply stores the indices and non-zero values as a triplet (i, j, v).
>      >
>      >
>      > That is the format of the dgTMatrix class from the Matrix package for R but not, as far as I can tell, in Armadillo.  A brief glance at the Armadillo
>      > documentation indicates that sparse matrices are always in the compressed sparse column (CSC) format.
>     Indeed, but nothing prevents Binxiang from developing a wrap() that converts
>     the Armadillo format to one or more of the R formats, right?
> 
> 
> Why?
Sure, Matrix is very versatile and rich in features, but the price for this is its heavy weight.
It can take several seconds to load. On my rather mighty PC (Intel Xeon E5-2609 v2 @ 2.50GHz with 16 GB of memory),
I get:
 > system.time(library(Matrix))
   user  system elapsed
  1.427   0.052   1.619

I don't have my laptop here, but the load time on it can be longer.
For slam, by contrast, it takes only a fraction of a second:

 > system.time(library(slam))
   user  system elapsed
  0.012   0.000   0.011
When slam suffices, why not use it?

> Is there a reason for doing type conversion from the dgCMatrix format to another format in an Rcpp wrap function instead of with the existing functions 
> from the Matrix package?
> 
> Bear in mind that dgCMatrix is an efficient format both in terms of the amount of memory required  (that's the "compressed" part of the name) and in terms of 
> performing operations with the matrix.  Most operations on sparse matrices stored in the triplet format start by creating a CSC or CSR (compressed sparse row) 
> form of the matrix anyway.
In the Matrix package, I presume?
The few basic operations that I have seen in slam stay with the triplet format.
So if a user has not loaded the Matrix package and wants to use, e.g., the slam format,
it would be great if wrap() could give them the expected format.
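
To make the idea concrete, here is a minimal sketch of the index bookkeeping such a wrap() would need: expanding compressed sparse column (CSC) arrays into (i, j, v) triplets. It is only an illustration under stated assumptions. Plain std::vector inputs stand in for the values, row indices and column pointers that a CSC matrix holds, so nothing from Rcpp or Armadillo is required. The names Triplet and csc_to_triplet are hypothetical and are not part of slam, Rcpp or RcppArmadillo. The output indices are 1-based, as R expects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container mirroring an (i, j, v) triplet layout.
// This is an illustration only, not slam's or Rcpp's actual API.
struct Triplet {
    std::vector<int> i;     // 1-based row index of each non-zero
    std::vector<int> j;     // 1-based column index of each non-zero
    std::vector<double> v;  // value of each non-zero
    int nrow;
    int ncol;
};

// Expand 0-based CSC arrays into 1-based triplets.
// col_ptr has ncol + 1 entries; column c owns the positions
// col_ptr[c] .. col_ptr[c + 1] - 1 of values and row_idx.
Triplet csc_to_triplet(const std::vector<double>& values,
                       const std::vector<int>& row_idx,
                       const std::vector<int>& col_ptr,
                       int nrow, int ncol) {
    // Basic consistency checks on the CSC input.
    assert(values.size() == row_idx.size());
    assert(col_ptr.size() == static_cast<std::size_t>(ncol) + 1);

    Triplet t;
    t.nrow = nrow;
    t.ncol = ncol;
    t.i.reserve(values.size());
    t.j.reserve(values.size());
    t.v.reserve(values.size());

    for (int c = 0; c < ncol; ++c) {
        for (int k = col_ptr[c]; k < col_ptr[c + 1]; ++k) {
            t.i.push_back(row_idx[k] + 1);  // shift to 1-based rows
            t.j.push_back(c + 1);           // the column is implied by col_ptr
            t.v.push_back(values[k]);
        }
    }
    return t;
}
```

The conversion is a single pass over the non-zeros. The only extra cost of the triplet form is one stored column index per non-zero instead of one pointer per column.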

