[Rcpp-devel] efficient ingestion of "sparse csv"

Vincent Carey stvjc at channing.harvard.edu
Tue May 11 06:27:55 CEST 2021


Thanks Dirk, lots of useful information there.  I wonder whether the sparse
ingestion problem would best be solved with multiple passes -- it seems one
would want to learn the dimensions and the number of nonzero elements per row
to allocate the index vectors, and then populate them and the data vector
with a final pass.  Or one could use a buffering strategy to grow the index
vectors as needed in a one-pass approach.
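
To make the two options concrete, here is a rough sketch in plain C++ (no
Rcpp glue). It assumes that "sparse csv" means an ordinary dense CSV whose
entries are mostly zero, and that the target is a row-compressed layout; the
Csr struct and the function names are made up for illustration and do not
come from Rcpp, Armadillo, or the Gallery post. The one-pass version leans on
std::vector's amortized growth as its buffering strategy.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Compressed sparse row (CSR) storage; all names here are illustrative.
struct Csr {
    std::size_t nrow = 0, ncol = 0;
    std::vector<std::size_t> row_ptr;  // length nrow + 1
    std::vector<std::size_t> col_idx;  // length nnz
    std::vector<double> values;        // length nnz
};

// Call f(column, value) for each field of one CSV line; return the field count.
// Empty fields are treated as zero (an assumption of this sketch).
template <typename F>
std::size_t for_each_field(const std::string& line, F f) {
    std::istringstream fields(line);
    std::string field;
    std::size_t col = 0;
    while (std::getline(fields, field, ',')) {
        f(col, field.empty() ? 0.0 : std::stod(field));
        ++col;
    }
    return col;
}

// Two passes: count first, allocate exactly once, then populate.
Csr read_two_pass(const std::string& text) {
    Csr m;
    m.row_ptr.push_back(0);
    std::string line;

    // Pass 1: learn the dimensions and the running nonzero count per row.
    std::istringstream pass1(text);
    std::size_t nnz = 0;
    while (std::getline(pass1, line)) {
        if (line.empty()) continue;
        std::size_t n = for_each_field(
            line, [&](std::size_t, double v) { if (v != 0.0) ++nnz; });
        if (n > m.ncol) m.ncol = n;
        m.row_ptr.push_back(nnz);
        ++m.nrow;
    }

    // Allocate the index and data vectors at their final size.
    m.col_idx.resize(nnz);
    m.values.resize(nnz);

    // Pass 2: fill the preallocated vectors.
    std::istringstream pass2(text);
    std::size_t k = 0;
    while (std::getline(pass2, line)) {
        if (line.empty()) continue;
        for_each_field(line, [&](std::size_t c, double v) {
            if (v != 0.0) { m.col_idx[k] = c; m.values[k] = v; ++k; }
        });
    }
    assert(k == nnz);  // both passes must agree on the nonzero count
    return m;
}

// One pass: let the vectors grow as nonzeros are encountered.
Csr read_one_pass(const std::string& text) {
    Csr m;
    m.row_ptr.push_back(0);
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) continue;
        std::size_t n = for_each_field(line, [&](std::size_t c, double v) {
            if (v != 0.0) { m.col_idx.push_back(c); m.values.push_back(v); }
        });
        if (n > m.ncol) m.ncol = n;
        m.row_ptr.push_back(m.values.size());
        ++m.nrow;
    }
    return m;
}
```

Both functions should produce the same three vectors for the same input. The
two-pass version parses every field twice but never reallocates; the one-pass
version parses once and pays for occasional reallocation and copying as the
vectors grow. Which wins presumably depends on whether parsing or memory
traffic dominates for the files at hand.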

On Mon, May 10, 2021 at 11:19 PM Dirk Eddelbuettel <edd at debian.org> wrote:

>
> Vincent,
>
> In the broad terms of the question the best answer may be a simple "sure".
> More seriously, there have been many approaches.  Consider for example the
> recent Rcpp Gallery post led by Zach (with some edits by me):
>   https://gallery.rcpp.org/articles/sparse-matrix-class/
>
> Its focus is on not copying <i,p,x> again if we already have them as R
> vectors, which is a fair point. If the goal is to get to SuperLU via
> (Rcpp)Armadillo
> then I do not think you can avoid the (internal) copies.  As always, the
> answer may be "it depends".
>
> Hope this helps, happy to refine,  Dirk
>
> --
> https://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
>
