[Rcpp-devel] Efficient DataFrame access by row & column

I need to loop through all the entries of a DataFrame by row, then by column. I know two different ways:

  // Case A: When df.length() is unknown at coding time:
  int n = df.nrows();
  int m = df.length();
  for(int i=0; i<n; i++) {
    for(int j=0; j<m; j++) {
      NumericVector v = df[j];
      // ... do stuff with v[i] ...
    }
  }

  // Case B: If I know the number of columns while writing the C++ code:
  int n = df.nrows();
  NumericVector xs = df[0];
  NumericVector ys = df[1];
  for(int i=0; i<n; i++) {
    // ... do stuff with xs[i] and ys[i] ...
  }

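The second way is less flexible, but it's also quite a bit faster in practice. In Case A the "NumericVector v = df[j]" expression runs once per cell, while in Case B it runs only once per column, so I presume this means the "NumericVector ..." expressions are doing a non-trivial amount of work (perhaps even copying the whole vector?).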

Is there a way to have my cake & eat it?  Can I efficiently (in O(1) time) index into a DataFrame by numeric row index and numeric column index?
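
One shape that might give both properties is to pull every column out once, before the row loop, and keep the handles in a std::vector, so that the per-cell work is plain indexing. The block below is only a sketch of that idea in standard C++, not a tested Rcpp answer: ToyFrame is a hypothetical stand-in that mimics just the length(), nrows() and df[j] calls used above, and sum_row_major is an illustrative name, not part of any library.

```cpp
#include <vector>

// Hypothetical stand-in for a data frame: equal-length numeric columns.
// It mimics only the members used in Case A and Case B.
struct ToyFrame {
    std::vector<std::vector<double>> columns;

    int length() const { return static_cast<int>(columns.size()); }
    int nrows() const {
        return columns.empty() ? 0 : static_cast<int>(columns[0].size());
    }
    // Returns the column by value, so every call does real work,
    // which is the cost being hoisted out of the row loop below.
    std::vector<double> operator[](int j) const { return columns[j]; }
};

// Row-then-column traversal with the column lookup done once per column.
// Column is the per-column handle type (NumericVector in the post).
template <typename Column, typename Frame>
double sum_row_major(const Frame& df) {
    const int n = df.nrows();
    const int m = df.length();

    // One conversion per column, as in Case B, but for any column count.
    std::vector<Column> cols;
    cols.reserve(m);
    for (int j = 0; j < m; j++) {
        Column c = df[j];
        cols.push_back(c);
    }

    double total = 0.0;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < m; j++) {
            total += cols[j][i];  // plain indexing, no per-cell conversion
        }
    }
    return total;
}
```

With the stand-in, the call is sum_row_major<std::vector<double>>(df). With Rcpp the corresponding call would presumably be sum_row_major<NumericVector>(df), assuming every column is numeric; that part is an assumption and has not been compiled against Rcpp here.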

I'm also curious why it's a compile error in Case A to just write `df[j][i]` or even `((NumericVector) df[j])[i]`. Clearly there's magic behind the "NumericVector" call that I don't understand.


