[Rcpp-devel] Result of Rcpp Wrap() for Sparse Matrix

Dmitriy Selivanov selivanov.dmitriy at gmail.com
Wed Jun 14 18:44:55 CEST 2017


My 2 cents. Over the last couple of years I have used sparse matrices a lot,
and the Matrix package is really great. I'm not sure I understand the issue
with wrapping: as Doug said, CSC is the main format in both Armadillo and
Matrix. Given a matrix in CSC format (dgCMatrix/CsparseMatrix), it is trivial
to convert it to COO or CSR with as(x, "TsparseMatrix") /
as(x, "RsparseMatrix").
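
To illustrate why the CSC to COO direction is cheap, here is a minimal C++ sketch. It assumes a plain struct that mirrors the i/p/x slots of a dgCMatrix; the names Csc, Coo and csc_to_coo are hypothetical and are not taken from Matrix, Armadillo or RcppArmadillo.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical plain-struct view of a CSC matrix (0-based indices),
// modelled on the i/p/x slots of a dgCMatrix.
struct Csc {
    int nrow, ncol;
    std::vector<int> i;     // row index of each stored value
    std::vector<int> p;     // column j occupies [p[j], p[j+1]); size ncol + 1
    std::vector<double> x;  // stored values
};

// Hypothetical triplet (COO) container: one (row, col, value) per entry.
struct Coo {
    int nrow, ncol;
    std::vector<int> i, j;
    std::vector<double> x;
};

// CSC -> COO: row indices and values are copied unchanged; only the
// compressed column pointers are expanded into explicit column indices.
Coo csc_to_coo(const Csc& a) {
    assert(a.p.size() == static_cast<std::size_t>(a.ncol) + 1);
    Coo out{a.nrow, a.ncol, a.i, {}, a.x};
    out.j.reserve(a.x.size());
    for (int col = 0; col < a.ncol; ++col)
        for (int k = a.p[col]; k < a.p[col + 1]; ++k)
            out.j.push_back(col);
    return out;
}
```

The conversion is a single pass over the stored entries, which is the sense in which I call it trivial.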

The second point is about the slam package and the COO format. I haven't used
slam, but I have used scipy, Armadillo and Eigen, and none of them uses the COO
format for operations on matrices... I doubt it could be efficient.

The third point is that I have the feeling that the CSR format is more
mainstream nowadays. For instance, Eigen implements multithreaded sparse-dense
multiplications and sparse solvers
(https://eigen.tuxfamily.org/dox/TopicMultiThreading.html). It is the same
story with sparse BLAS in Intel MKL: it works with CSR matrices. I realize that
CSR = transposed CSC, but it is still inconvenient to transpose everything in
your head each time. (It would be great to add more support for CSR matrices,
but that is out of scope for this discussion.)
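
The "CSR = transposed CSC" remark can be sketched in C++ as follows. This is only an illustration under my own assumptions: csr_matvec is a hypothetical helper, not an Eigen or MKL routine. Feeding it the CSC arrays (p, i, x) of a matrix A yields the product with t(A), because the same three arrays read row-wise describe the transpose.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-wise product y = M * v for a matrix M given by CSR arrays:
// ptr has nrow + 1 entries, idx holds column indices, val holds values.
// Hypothetical helper for illustration only.
std::vector<double> csr_matvec(const std::vector<int>& ptr,
                               const std::vector<int>& idx,
                               const std::vector<double>& val,
                               const std::vector<double>& v) {
    assert(!ptr.empty());
    assert(idx.size() == val.size());
    const std::size_t nrow = ptr.size() - 1;
    std::vector<double> y(nrow, 0.0);
    for (std::size_t r = 0; r < nrow; ++r)
        for (int k = ptr[r]; k < ptr[r + 1]; ++k)
            y[r] += val[k] * v[idx[k]];  // accumulate row r
    return y;
}
```

So when a library only accepts CSR, a CSC matrix can be handed over without copying, as long as one remembers that every result then refers to the transpose.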

And my last observation: I agree with Doug that Eigen seems to have much
stronger support for operations on sparse matrices.

On 14 Jun 2017 at 19:55, <rcpp-devel-request at lists.r-forge.r-project.org> wrote:


Today's Topics:

   1. Re: [RcppArmadillo] Result of Rcpp Wrap() for Sparse Matrix
      (Douglas Bates)
   2. Re: [RcppArmadillo] Result of Rcpp Wrap() for Sparse Matrix
      (Dirk Eddelbuettel)
   3. Re: [RcppArmadillo] Result of Rcpp Wrap() for Sparse Matrix
      (Serguei Sokol)
   4. Re: [RcppArmadillo] Result of Rcpp Wrap() for Sparse Matrix
      (Douglas Bates)
   5. Re: [RcppArmadillo] Result of Rcpp Wrap() for Sparse Matrix
      (Serguei Sokol)


----------------------------------------------------------------------

Message: 1
Date: Wed, 14 Jun 2017 13:21:54 +0000
From: Douglas Bates <bates at stat.wisc.edu>
To: serguei.sokol at gmail.com, Binxiang Ni <binxiangni at gmail.com>,
        rcpp-devel at lists.r-forge.r-project.org
Subject: Re: [Rcpp-devel] [RcppArmadillo] Result of Rcpp Wrap() for
        Sparse  Matrix
Message-ID:
        <CAO7JsnTo8bA6LTsHz0udyjF-KAaE2kp-rKb13tyudg6gV=JiAQ at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

On Wed, Jun 14, 2017 at 3:59 AM Serguei Sokol <serguei.sokol at gmail.com>
wrote:

> On 13/06/2017 at 18:24, Douglas Bates wrote:
> > On Tue, Jun 13, 2017 at 10:56 AM Binxiang Ni <binxiangni at gmail.com> wrote:
> >
> >     Hi,
> >
> >     I am working on fixing sparse matrix conversion for RcppArmadillo.
> >     Now a problem comes up to me: what kind of sparse matrix is expected
> >     to pass from Armadillo to R? That is, what should the result of wrap()
> >     be? dgCMatrix (if logical, lgCMatrix or ngCMatrix) or their original type?
> >
> >
> > What do you mean by "their original type"?
> >
> > It seems that the correspondence is
> > Armadillo      Matrix package
> > sp_mat     <=> dgCMatrix
> > sp_cx_mat  <=> zgCMatrix
> > sp_imat    <=> igCMatrix
> I would also consider the format used in a package slam.
> It simply stores the indexes and non-zero values in a triplet (i,j,v).
>

That is the format of the dgTMatrix class from the Matrix package for R but
not, as far as I can tell, in Armadillo.  A brief glance at the Armadillo
documentation indicates that sparse matrices are always in the compressed
sparse column (CSC) format.

I would point out that the sparse matrix facilities in Eigen and RcppEigen
are much more extensive than those in Armadillo.

------------------------------

Message: 2
Date: Wed, 14 Jun 2017 09:01:30 -0500
From: Dirk Eddelbuettel <edd at debian.org>
To: serguei.sokol at gmail.com
Cc: rcpp-devel at lists.r-forge.r-project.org, Binxiang Ni
        <binxiangni at gmail.com>
Subject: Re: [Rcpp-devel] [RcppArmadillo] Result of Rcpp Wrap() for
        Sparse Matrix
Message-ID: <22849.16826.22396.339250 at max.eddelbuettel.com>
Content-Type: text/plain; charset=iso-8859-1


On 14 June 2017 at 11:00, Serguei Sokol wrote:
| On 13/06/2017 at 18:24, Douglas Bates wrote:
| > On Tue, Jun 13, 2017 at 10:56 AM Binxiang Ni <binxiangni at gmail.com> wrote:
| >
| >     Hi,
| >
| >     I am working on fixing sparse matrix conversion for RcppArmadillo.
| >     Now a problem comes up to me: what kind of sparse matrix is expected
| >     to pass from Armadillo to R? That is, what should the result of wrap()
| >     be? dgCMatrix (if logical, lgCMatrix or ngCMatrix) or their original type?
| >
| >
| > What do you mean by "their original type"?
| >
| > It seems that the correspondence is
| > Armadillo      Matrix package
| > sp_mat     <=> dgCMatrix
| > sp_cx_mat  <=> zgCMatrix
| > sp_imat    <=> igCMatrix
| I would also consider the format used in a package slam.
| It simply stores the indexes and non-zero values in a triplet (i,j,v).

There is more here:  https://en.wikipedia.org/wiki/Sparse_matrix

But it would probably be good to hear from some actual users of sparse
matrices such as Doug (thanks for piping in already!), Soren or anybody else
with exposure to sparse matrices, ideally via CRAN packages we can wire up
for testing.


Dirk

--
http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org


------------------------------

Message: 3
Date: Wed, 14 Jun 2017 16:06:58 +0200
From: Serguei Sokol <serguei.sokol at gmail.com>
To: Douglas Bates <bates at stat.wisc.edu>, Binxiang Ni
        <binxiangni at gmail.com>, rcpp-devel at lists.r-forge.r-project.org
Subject: Re: [Rcpp-devel] [RcppArmadillo] Result of Rcpp Wrap() for
        Sparse Matrix
Message-ID: <1ade20d8-f08c-df7b-e147-539e4b1babff at gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed

On 14/06/2017 at 15:21, Douglas Bates wrote:
>
>
> On Wed, Jun 14, 2017 at 3:59 AM Serguei Sokol <serguei.sokol at gmail.com> wrote:
>
>     On 13/06/2017 at 18:24, Douglas Bates wrote:
>      > On Tue, Jun 13, 2017 at 10:56 AM Binxiang Ni <binxiangni at gmail.com> wrote:
>      >
>      >     Hi,
>      >
>      >     I am working on fixing sparse matrix conversion for RcppArmadillo.
>      >     Now a problem comes up to me: what kind of sparse matrix is expected
>      >     to pass from Armadillo to R? That is, what should the result of wrap()
>      >     be? dgCMatrix (if logical, lgCMatrix or ngCMatrix) or their original type?
>      >
>      >
>      > What do you mean by "their original type"?
>      >
>      > It seems that the correspondence is
>      > Armadillo      Matrix package
>      > sp_mat     <=> dgCMatrix
>      > sp_cx_mat  <=> zgCMatrix
>      > sp_imat    <=> igCMatrix
>     I would also consider the format used in a package slam.
>     It simply stores the indexes and non-zero values in a triplet (i,j,v).
>
>
> That is the format of the dgTMatrix class from the Matrix package for R
> but not, as far as I can tell, in Armadillo.  A brief glance at the
> Armadillo documentation indicates that sparse matrices are always in the
> compressed sparse column (CSC) format.
Indeed, but nothing prevents Binxiang from developing a wrap() that will
convert the Armadillo format to one or many of the R formats, right?


------------------------------

Message: 4
Date: Wed, 14 Jun 2017 15:33:05 +0000
From: Douglas Bates <bates at stat.wisc.edu>
To: serguei.sokol at gmail.com, Binxiang Ni <binxiangni at gmail.com>,
        rcpp-devel at lists.r-forge.r-project.org
Subject: Re: [Rcpp-devel] [RcppArmadillo] Result of Rcpp Wrap() for
        Sparse  Matrix
Message-ID:
        <CAO7JsnT7cj3pAqF7rEJ-EV_qke+Lr2suZu1AVpkPNT2O4bjVcg at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

On Wed, Jun 14, 2017 at 9:06 AM Serguei Sokol <serguei.sokol at gmail.com>
wrote:

> On 14/06/2017 at 15:21, Douglas Bates wrote:
> >
> >
> > On Wed, Jun 14, 2017 at 3:59 AM Serguei Sokol <serguei.sokol at gmail.com> wrote:
> >
> >     On 13/06/2017 at 18:24, Douglas Bates wrote:
> >      > On Tue, Jun 13, 2017 at 10:56 AM Binxiang Ni <binxiangni at gmail.com> wrote:
> >      >
> >      >     Hi,
> >      >
> >      >     I am working on fixing sparse matrix conversion for RcppArmadillo.
> >      >     Now a problem comes up to me: what kind of sparse matrix is expected
> >      >     to pass from Armadillo to R? That is, what should the result of wrap()
> >      >     be? dgCMatrix (if logical, lgCMatrix or ngCMatrix) or their original type?
> >      >
> >      >
> >      > What do you mean by "their original type"?
> >      >
> >      > It seems that the correspondence is
> >      > Armadillo      Matrix package
> >      > sp_mat     <=> dgCMatrix
> >      > sp_cx_mat  <=> zgCMatrix
> >      > sp_imat    <=> igCMatrix
> >     I would also consider the format used in a package slam.
> >     It simply stores the indexes and non-zero values in a triplet (i,j,v).
> >
> >
> > That is the format of the dgTMatrix class from the Matrix package for R
> > but not, as far as I can tell, in Armadillo.  A brief glance at the
> > Armadillo documentation indicates that sparse matrices are always in the
> > compressed sparse column (CSC) format.
> Indeed, but nothing prevents Binxiang from developing a wrap() that will
> convert the Armadillo format to one or many of the R formats, right?
>

Why? Is there a reason for doing type conversion from the dgCMatrix format
to another format in an Rcpp wrap function instead of with the existing
functions from the Matrix package?

Bear in mind that dgCMatrix is an efficient format both in terms of the
amount of memory required  (that's the "compressed" part of the name) and
in terms of performing operations with the matrix.  Most operations on
sparse matrices stored in the triplet format start by creating a CSC or CSR
(compressed sparse row) form of the matrix anyway.
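
The step of building a compressed form from triplets can be sketched in C++ as a counting sort by column. This is a minimal illustration under stated assumptions, not the code used by Matrix or Armadillo; the names CscArrays and coo_to_csc are hypothetical, and duplicate (i, j) entries are kept as separate stored values rather than summed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for the three CSC arrays (0-based indices).
struct CscArrays {
    std::vector<int> p;     // column pointers, size ncol + 1
    std::vector<int> i;     // row indices
    std::vector<double> x;  // values
};

// COO -> CSC by counting sort on the column index. Entries within a
// column keep their input order; duplicates are not merged.
CscArrays coo_to_csc(int ncol,
                     const std::vector<int>& ti,
                     const std::vector<int>& tj,
                     const std::vector<double>& tx) {
    assert(ti.size() == tj.size() && tj.size() == tx.size());
    const std::size_t nnz = tx.size();
    CscArrays out{std::vector<int>(static_cast<std::size_t>(ncol) + 1, 0),
                  std::vector<int>(nnz), std::vector<double>(nnz)};
    for (int col : tj) ++out.p[col + 1];                      // entries per column
    for (int c = 0; c < ncol; ++c) out.p[c + 1] += out.p[c];  // prefix sum
    std::vector<int> next(out.p.begin(), out.p.end() - 1);    // next free slot per column
    for (std::size_t k = 0; k < nnz; ++k) {
        const int dst = next[tj[k]]++;
        out.i[dst] = ti[k];
        out.x[dst] = tx[k];
    }
    return out;
}
```

The cost is linear in the number of stored entries plus the number of columns, which is why converting once up front is usually cheaper than operating on unsorted triplets directly.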

------------------------------

Message: 5
Date: Wed, 14 Jun 2017 17:55:49 +0200
From: Serguei Sokol <serguei.sokol at gmail.com>
To: Douglas Bates <bates at stat.wisc.edu>, Binxiang Ni
        <binxiangni at gmail.com>, rcpp-devel at lists.r-forge.r-project.org
Subject: Re: [Rcpp-devel] [RcppArmadillo] Result of Rcpp Wrap() for
        Sparse Matrix
Message-ID: <4cbe9f66-6755-70bc-3a59-2a660015b596 at gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed

On 14/06/2017 at 17:33, Douglas Bates wrote:
>
>
> On Wed, Jun 14, 2017 at 9:06 AM Serguei Sokol <serguei.sokol at gmail.com> wrote:
>
>     On 14/06/2017 at 15:21, Douglas Bates wrote:
>      >
>      >
>      > On Wed, Jun 14, 2017 at 3:59 AM Serguei Sokol <serguei.sokol at gmail.com> wrote:
>      >
>      >     On 13/06/2017 at 18:24, Douglas Bates wrote:
>      >      > On Tue, Jun 13, 2017 at 10:56 AM Binxiang Ni <binxiangni at gmail.com> wrote:
>      >      >
>      >      >     Hi,
>      >      >
>      >      >     I am working on fixing sparse matrix conversion for RcppArmadillo.
>      >      >     Now a problem comes up to me: what kind of sparse matrix is expected
>      >      >     to pass from Armadillo to R? That is, what should the result of wrap()
>      >      >     be? dgCMatrix (if logical, lgCMatrix or ngCMatrix) or their original type?
>      >      >
>      >      >
>      >      > What do you mean by "their original type"?
>      >      >
>      >      > It seems that the correspondence is
>      >      > Armadillo      Matrix package
>      >      > sp_mat     <=> dgCMatrix
>      >      > sp_cx_mat  <=> zgCMatrix
>      >      > sp_imat    <=> igCMatrix
>      >     I would also consider the format used in a package slam.
>      >     It simply stores the indexes and non-zero values in a triplet (i,j,v).
>      >
>      >
>      > That is the format of the dgTMatrix class from the Matrix package
>      > for R but not, as far as I can tell, in Armadillo.  A brief glance at the
>      > Armadillo documentation indicates that sparse matrices are always in the
>      > compressed sparse column (CSC) format.
>     Indeed, but nothing prevents Binxiang from developing a wrap() that will
>     convert the Armadillo format to one or many of the R formats, right?
>
>
> Why?
Sure, Matrix is very versatile and rich in features, but the price for this
is its heavy weight.
It can take several seconds to load. On my rather mighty PC (Intel
Xeon E5-2609 v2 @ 2.50GHz with 16 GB of memory),
I have:
 > system.time(library(Matrix))
        user      system     elapsed
       1.427       0.052       1.619

I don't have my laptop here, but the load time there can be longer.
For slam, by contrast, it takes only a fraction of a second:

 > system.time(library(slam))
        user      system     elapsed
       0.012       0.000       0.011
When slam can suffice, why not use it?

> Is there a reason for doing type conversion from the dgCMatrix format to
> another format in an Rcpp wrap function instead of with the existing
> functions from the Matrix package?
>
> Bear in mind that dgCMatrix is an efficient format both in terms of the
> amount of memory required (that's the "compressed" part of the name) and
> in terms of performing operations with the matrix.  Most operations on
> sparse matrices stored in the triplet format start by creating a CSC or CSR
> (compressed sparse row) form of the matrix anyway.
In the Matrix package, I presume?
The few basic operations that I have seen in slam stay with the triplet format.
So if a user has not loaded the Matrix package and wants to use e.g. the slam
format, it would be great if wrap() could give them the expected format.


------------------------------

End of Rcpp-devel Digest, Vol 92, Issue 12
******************************************