[Rcpp-devel] Efficient DataFrame access by row & column

Ken Williams Ken.Williams at windlogics.com
Wed Feb 20 01:33:25 CET 2013


I made one more alternative, countSteps4(), and it seems to do pretty well:

  https://gist.github.com/kenahoo/4991485

Curiously, it performs much better than the NumericMatrix version.  Does anyone have insight into why?

-Ken

From: rcpp-devel-bounces at lists.r-forge.r-project.org [mailto:rcpp-devel-bounces at lists.r-forge.r-project.org] On Behalf Of Ken Williams
Sent: Tuesday, February 19, 2013 6:24 PM
To: Kevin Ushey
Cc: rcpp-devel at lists.r-forge.r-project.org
Subject: Re: [Rcpp-devel] Efficient DataFrame access by row & column

That may be a good option if I'm willing to make at least one copy, right at the beginning, to create the NumericMatrix.  I'd prefer to avoid even that copy if possible, but if it must be, it must be.

It performs somewhere between the other two options:

  https://gist.github.com/kenahoo/4991485

-Ken


From: Kevin Ushey [mailto:kevinushey at gmail.com]
Sent: Tuesday, February 19, 2013 6:07 PM
To: Ken Williams
Cc: John Merrill; rcpp-devel at lists.r-forge.r-project.org
Subject: Re: [Rcpp-devel] Efficient DataFrame access by row & column

Another thing worth thinking about: perhaps the easiest way to side-step the issue is to work with a NumericMatrix rather than a DataFrame. At least, from the example you gave, it sounds like a container where you have the expectation that each column is a NumericVector of equal length.

If you can make the switch to NumericMatrix, then you can generate and operate with row/column views, e.g. NumericMatrix::Row and NumericMatrix::Column, which will generate references to rows / columns and hence avoid copying. (These are generated whenever you do e.g. x(i, _) or x(_, i) on a NumericMatrix x).
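To illustrate why a row or column view can avoid copying, here is a minimal sketch in plain standard C++. It is not Rcpp's implementation: `ColMajorMatrix`, `StridedView`, `row_view` and `column_view` are hypothetical names made up for this example. It assumes only that the matrix is stored column by column, as R matrices are, so a column is contiguous and a row is reached with a stride of `nrow`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a numeric matrix stored column by column,
// the layout R uses for its matrices.
struct ColMajorMatrix {
    std::size_t nrow, ncol;
    std::vector<double> data;  // element (i, j) lives at data[i + j * nrow]

    ColMajorMatrix(std::size_t r, std::size_t c)
        : nrow(r), ncol(c), data(r * c, 0.0) {}

    double& operator()(std::size_t i, std::size_t j) {
        assert(i < nrow && j < ncol);
        return data[i + j * nrow];
    }
};

// A view holds a pointer into the parent's storage plus a stride.
// No elements are copied, so writes go straight to the matrix.
struct StridedView {
    double* start;
    std::size_t stride;
    std::size_t length;

    double& operator[](std::size_t k) {
        assert(k < length);
        return start[k * stride];
    }
};

// Row i: starts at offset i, steps by nrow, has ncol elements.
StridedView row_view(ColMajorMatrix& m, std::size_t i) {
    assert(i < m.nrow);
    return StridedView{m.data.data() + i, m.nrow, m.ncol};
}

// Column j: starts at offset j * nrow, steps by 1, has nrow elements.
StridedView column_view(ColMajorMatrix& m, std::size_t j) {
    assert(j < m.ncol);
    return StridedView{m.data.data() + j * m.nrow, 1, m.nrow};
}
```

In this sketch, writing through `row_view(m, 1)[0]` changes `m(1, 0)` directly. Both kinds of view are copy-free, but a row view jumps `nrow` elements between accesses while a column view walks contiguous memory.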

-Kevin

On Tue, Feb 19, 2013 at 3:26 PM, Ken Williams <Ken.Williams at windlogics.com> wrote:


> From: John Merrill [mailto:john.merrill at gmail.com]
> Sent: Tuesday, February 19, 2013 5:24 PM
> To: Ken Williams
> Cc: Yan Zhou; Dirk Eddelbuettel; rcpp-devel at lists.r-forge.r-project.org
> Subject: Re: [Rcpp-devel] Efficient DataFrame access by row & column
>
> I'm a little puzzled by your question.  Could you use a reference instead of instantiating a new copy?

I would love to use a reference, but I don't know how.  That's in fact the essence of my question. =)

Is there already some example code somewhere showing how to get a reference to a DataFrame column without copying?  I must be just missing it.
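For readers unsure what "reference instead of a copy" means here, the distinction can be sketched in plain standard C++. This is an analogy only, not the Rcpp API: `Frame`, `column_index`, `column_ref` and `column_copy` are hypothetical names invented for the example. (In Rcpp itself, as far as I know, `NumericVector x = df["a"];` wraps the existing R vector rather than duplicating it when the column is already of type double, but treat that as an assumption to check against the Rcpp documentation.)

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a data frame: named columns of equal length.
struct Frame {
    std::vector<std::string> names;
    std::vector<std::vector<double>> columns;
};

// Returns the index of the named column, or throws if it does not exist.
std::size_t column_index(const Frame& f, const std::string& name) {
    assert(f.names.size() == f.columns.size());
    for (std::size_t j = 0; j < f.names.size(); ++j) {
        if (f.names[j] == name) return j;
    }
    throw std::out_of_range("no such column: " + name);
}

// Returning by reference hands back the frame's own storage:
// nothing is copied, and writes are visible in the frame.
std::vector<double>& column_ref(Frame& f, const std::string& name) {
    return f.columns[column_index(f, name)];
}

// Returning by value duplicates every element of the column;
// changes to the result do not affect the frame.
std::vector<double> column_copy(const Frame& f, const std::string& name) {
    return f.columns[column_index(f, name)];
}
```

Calling `column_ref` inside a loop costs only the name lookup, whereas `column_copy` pays for a full copy of the column on every call.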

 -Ken

