[Rcpp-devel] Sparse matrices with RcppArmadillo

Dirk Eddelbuettel edd at debian.org
Mon Dec 24 01:17:15 CET 2012


On 9 December 2012 at 14:56, Romain Francois wrote:
| Thanks for the valuable information.
| 
| I will look into it more seriously at some point when I have some more 
| time.

I have a minor improvement now, based on poking around in SpMat_meat.hpp and
the previous discussion, particularly Conrad's hints still cited below.  

It is still pretty raw.  The main problem I have right is that when coming in
as R objects (even with the internal Compressed Sparse Column format), the
row and col vectors are R integers and hence unsigned.  So the 'block copy'
that Conrad uses does not work.  I'm sure Romain already has a trick for that
in other places.

But this demonstrates that we're getting close, and that it probably would
not take much in terms of resources to provide more solid sparse matrix
support. 

Dirk

Code below:

// -*- mode: C++; c-indent-level: 4; c-basic-offset: 4; indent-tabs-mode: nil; -*-

#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]

using namespace Rcpp ;

// [[Rcpp::export]]
void convertA(S4 mat) {         // Romain's version from Dec 8
    IntegerVector dims = mat.slot("Dim");
    IntegerVector i = mat.slot("i");
    IntegerVector p = mat.slot("p");
    NumericVector x = mat.slot("x");

    int nrow = dims[0], ncol = dims[1];
    arma::sp_mat res(nrow, ncol);

    // this is not the proper way to initialize a sparse arma matrix
    // better code should be forthcoming at some point in the future
    for(int j = 0; j < ncol; ++j) {
        for (int k = p[j]; k < p[j + 1]; ++k) 
            res(i[k], j) = x[k];
    }
    std::cout << res << std::endl;
}


// [[Rcpp::export]]
void convertB(S4 mat) {         // slight improvement with two non-nested loops
    IntegerVector dims = mat.slot("Dim");
    IntegerVector i = mat.slot("i");
    IntegerVector p = mat.slot("p");
    NumericVector x = mat.slot("x");

    int nrow = dims[0], ncol = dims[1];
    arma::sp_mat res(nrow, ncol);

    // create space for values, and copy
    arma::access::rw(res.values) = arma::memory::acquire_chunked<double>(x.size() + 1);
    arma::arrayops::copy(arma::access::rwp(res.values), x.begin(), x.size() + 1);

    // create space for row_indices, and copy -- so far in a lame loop
    arma::access::rw(res.row_indices) = arma::memory::acquire_chunked<arma::uword>(x.size() + 1);
    for (int j=0; j<i.size(); j++) arma::access::rwp(res.row_indices)[j] = i[j];

    // create space for col_ptrs, and copy -- so far in a lame loop
    arma::access::rw(res.col_ptrs) = arma::memory::acquire<arma::uword>(p.size() + 2);
    for (int j=0; j<p.size(); j++) arma::access::rwp(res.col_ptrs)[j] = p[j];

    // important: set the sentinel as well
    arma::access::rwp(res.col_ptrs)[p.size()+1] = std::numeric_limits<arma::uword>::max();

    // set the number of non-zero elements
    arma::access::rw(res.n_nonzero) = x.size();

    std::cout << "SpMat res:\n" << res << std::endl;
}



/*** R
    require(Matrix)
    i <- c(1,3:8) 
    j <- c(2,9,6:10) 
    x <- 7 * (1:7)
    A <- sparseMatrix(i, j, x = x) 
    print(A)

    convertA(A)
    convertB(A)
***/



Output below:


R> sourceCpp("/tmp/armaSparse.cpp")

R>     require(Matrix)

R>     i <- c(1,3:8) 

R>     j <- c(2,9,6:10) 

R>     x <- 7 * (1:7)

R>     A <- sparseMatrix(i, j, x = x) 

R>     print(A)
8 x 10 sparse Matrix of class "dgCMatrix"
                             
[1,] . 7 . . .  .  .  .  .  .
[2,] . . . . .  .  .  .  .  .
[3,] . . . . .  .  .  . 14  .
[4,] . . . . . 21  .  .  .  .
[5,] . . . . .  . 28  .  .  .
[6,] . . . . .  .  . 35  .  .
[7,] . . . . .  .  .  . 42  .
[8,] . . . . .  .  .  .  . 49

R>     convertA(A)
[matrix size: 8x10; n_nonzero: 7; density: 8.75%]

     (0, 1)          7.0000
     (3, 5)         21.0000
     (4, 6)         28.0000
     (5, 7)         35.0000
     (2, 8)         14.0000
     (6, 8)         42.0000
     (7, 9)         49.0000


NULL

R>     convertB(A)
SpMat res:
[matrix size: 8x10; n_nonzero: 7; density: 8.75%]

     (0, 1)          7.0000
     (3, 5)         21.0000
     (4, 6)         28.0000
     (5, 7)         35.0000
     (2, 8)         14.0000
     (6, 8)         42.0000
     (7, 9)         49.0000


NULL
R> 


| 
| Romain
| 
| Le 09/12/12 14:40, c s a écrit :
| > Hi Romain & Doug,
| >
| > Two points to note:
| >
| > (1)
| > When poking around internal Armadillo pointers and arrays for sparse
| > matrices, please take into account that the col_ptrs array is slightly
| > longer than specified by the Compressed Sparse Column (CSC) format.
| >
| > However, this should not cause any problems when interfacing with
| > external libraries. If you use the set_size() function before writing
| > to the values[], row_indices[] and col_ptrs[] arrays, everything
| > should be okay.
| >
| > The col_ptrs array is slightly longer in order to allow robust
| > iterators.  Specifically, instead of the col_ptrs array having a size
| > of ncols+1 (as specified by CSC), it has a size of ncols+2.  The extra
| > element is always set to std::numeric_limits<uword>::max().  As such,
| > be careful not to overwrite this magic number.
| >
| > For more details within the code, see the function
| > SpMat<eT>::init(uword in_rows, uword in_cols), on line 3635 in
| > "SpMat_meat.hpp".
| >
| > (2)
| > If you need to allocate memory for the values[] and row_indices[]
| > arrays, you need to use the memory::acquire_chunked() function.  This
| > is an alternative memory allocation mechanism designed to reduce the
| > burden of inserting or removing a non-zero element.  The rest of the
| > sparse matrix memory management code relies on this.  See the code in
| > file "memory.hpp" for more details.
| >
| >
| > On Sun, Dec 9, 2012 at 5:55 AM, Romain Francois
| > <romain at r-enthusiasts.com> wrote:
| >> Yes. Conrad left a back door open so that we can add things **inside** the
| >> SpMat template, similar to what we have done in dense matrix classes.
| >>
| >> Going from there I was looking at the copy constructor of a SpMat, which
| >> eventually is a call to init:
| >>
| >> /**
| >>   * Copy from another matrix.
| >>   */
| >> template<typename eT>
| >> inline
| >> void
| >> SpMat<eT>::init(const SpMat<eT>& x)
| >>    {
| >>    arma_extra_debug_sigprint();
| >>
| >>    // Ensure we are not initializing to ourselves.
| >>    if (this != &x)
| >>      {
| >>      init(x.n_rows, x.n_cols);
| >>
| >>      // values and row_indices may not be null.
| >>      if (values != NULL)
| >>        {
| >>        memory::release(values);
| >>        memory::release(row_indices);
| >>        }
| >>
| >>      access::rw(values)      = memory::acquire_chunked<eT> (x.n_nonzero + 1);
| >>      access::rw(row_indices) = memory::acquire_chunked<uword>(x.n_nonzero +
| >> 1);
| >>
| >>      // Now copy over the elements.
| >>      arrayops::copy(access::rwp(values),      x.values,      x.n_nonzero +
| >> 1);
| >>      arrayops::copy(access::rwp(row_indices), x.row_indices, x.n_nonzero +
| >> 1);
| >>      arrayops::copy(access::rwp(col_ptrs),    x.col_ptrs,    x.n_cols + 1);
| >>
| >>      access::rw(n_nonzero) = x.n_nonzero;
| >>      }
| >>    }
| >>
| >> So it all looks very doable.
| >>
| >> Romain
| >
| 
| 
| -- 
| Romain Francois
| Professional R Enthusiast
| +33(0) 6 28 91 30 30
| 
| R Graph Gallery: http://gallery.r-enthusiasts.com
| 
| blog:            http://romainfrancois.blog.free.fr
| |- http://bit.ly/RE6sYH : OOP with Rcpp modules
| `- http://bit.ly/Thw7IK : Rcpp modules more flexible
| 
| _______________________________________________
| Rcpp-devel mailing list
| Rcpp-devel at lists.r-forge.r-project.org
| https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel

-- 
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com  


More information about the Rcpp-devel mailing list