[Rcpp-devel] Efficient DataFrame access by row & column
Yan Zhou
zhouyan at me.com
Wed Feb 20 00:09:52 CET 2013
The most inefficient part I see is the creation of a new NumericVector inside the innermost loop. Each column is copied n times, and n-1 of those copies are unnecessary.
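
A minimal sketch of the difference, in plain C++ rather than Rcpp so that it compiles on its own. The names Column, Frame, sum_copy_per_cell and sum_hoisted are hypothetical stand-ins and not part of any library. A "data frame" here is just a std::vector of equal-length std::vector<double> columns. In Rcpp terms, the analogous change would be to construct each column's NumericVector once, before the row loop, instead of once per cell.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a data frame: a list of equal-length numeric columns.
using Column = std::vector<double>;
using Frame  = std::vector<Column>;

// Slow pattern: a column object is created inside the innermost loop,
// so every column is copied once per row.
double sum_copy_per_cell(const Frame& df) {
    const std::size_t ncol = df.size();
    const std::size_t nrow = ncol ? df[0].size() : 0;
    double total = 0.0;
    for (std::size_t i = 0; i < nrow; ++i) {
        for (std::size_t j = 0; j < ncol; ++j) {
            Column col = df[j];  // deep copy of the whole column, repeated for every row
            total += col[i];
        }
    }
    return total;
}

// Faster pattern: take one handle per column before the row loop,
// then walk the cells by row and column without any further copies.
double sum_hoisted(const Frame& df) {
    const std::size_t ncol = df.size();
    const std::size_t nrow = ncol ? df[0].size() : 0;
    std::vector<const double*> cols;
    cols.reserve(ncol);
    for (const Column& c : df) {
        assert(c.size() == nrow);  // all columns must have the same length
        cols.push_back(c.data());  // pointer to existing storage, no copy
    }
    double total = 0.0;
    for (std::size_t i = 0; i < nrow; ++i) {
        for (std::size_t j = 0; j < ncol; ++j) {
            total += cols[j][i];
        }
    }
    return total;
}
```

Both functions visit the cells in the same row-then-column order and return the same result. Only the second avoids redoing the per-column setup for every row.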
Yan Zhou
On Feb 19, 2013, at 11:02 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
>
> Ken,
>
> On 19 February 2013 at 22:35, Ken Williams wrote:
> | I have a need to loop through all the entries of a DataFrame by row, then
> | column. I know two different ways:
>
> There have been prior discussions of this topic, as well as example posts --
> even leading to an Rcpp Gallery article. Did you read any of these? It wasn't
> clear from your post.
>
> | I'm also curious why it's a syntax error in Case A to just write `df[j][i]` or
>
> Eeeek. I prefer the more C++-y way of writing df(j,i). Square brackets only
> work for vectors, and even then you may be better off with x(i) for
> consistency.
>
> Overall, your premise may be wrong too. "We all know" that a data.frame is
> not the fastest data structure in R, so by forcing ourselves into the same
> access pattern, are we not handicapping ourselves?
>
> Once you are in C++, you can use whatever C++ datatype you like. A
> data.frame really is just a list of vectors, and each of the vectors has, e.g., a
> begin() iterator from which you can (fairly costlessly) instantiate STL types.
>
> And those give you performance guarantees.
>
> Hope this helps, Dirk
>
> --
> Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
> _______________________________________________
> Rcpp-devel mailing list
> Rcpp-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel