[Rcpp-devel] Performance question about DataFrame

Davor Cubranic cubranic at stat.ubc.ca
Thu Feb 7 21:28:57 CET 2013


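I'm coming late to this discussion, but it should be pointed out that using "sprintf" without ensuring that your buffer is long enough is not a "subtlety" but a bug. With "char name[5]" there is room for only four digits plus the terminating NUL, so the row-name loop overflows at row 10000; likewise "char name[6]" with the "X.%d" format overflows at column 1000.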

A more "C++ way" to do it, and most importantly safer, would be to use std::ostringstream:

// needs #include <sstream>
for (int i = 0; i < nrows; i++) {
  std::ostringstream rowname;
  rowname << i;
  row_names(i) = rowname.str();  // assign the formatted string, not the stream
}
for (int j = 0; j < ncols; j++) {
  std::ostringstream colname;
  colname << "X." << j;
  col_names(j) = colname.str();
}
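
For anyone who wants to try the idea outside of R, here is a minimal standalone sketch of the same approach in plain C++, with no Rcpp dependency. The helper names make_name and make_names are my own inventions for illustration, not anything Rcpp provides, and std::vector<std::string> merely stands in for StringVector.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of Rcpp): formats prefix + index.
// The stream grows as needed, so there is no fixed-size buffer to overflow.
std::string make_name(const std::string& prefix, int index) {
    std::ostringstream out;
    out << prefix << index;
    return out.str();
}

// Hypothetical helper: builds the n names prefix0, prefix1, ..., prefix(n-1).
// Returns an empty vector when n is zero or negative.
std::vector<std::string> make_names(const std::string& prefix, int n) {
    std::vector<std::string> names;
    if (n > 0) {
        names.reserve(static_cast<std::size_t>(n));
    }
    for (int i = 0; i < n; ++i) {
        names.push_back(make_name(prefix, i));
    }
    return names;
}
```

Calling make_names("", nrows) would give the row names and make_names("X.", ncols) the column names, and an index with eight or more digits is handled the same way as a single digit. The cost is one stream construction per name, which may or may not matter for very large frames; I have not measured it.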

Davor


On 2013-01-18, at 3:25 PM, John Merrill wrote:

> Sure.  I'll write something up for the gallery, but here's the crude outline.
> 
> Here's the C++ code:
> #include <Rcpp.h>
> 
> using namespace Rcpp;
> 
> // [[Rcpp::export]]
> List BuildCheapDataFrame(List a) {
>   List returned_frame = clone(a);
>   GenericVector sample_row = returned_frame(1);
> 
>   StringVector row_names(sample_row.length());
>   for (int i = 0; i < sample_row.length(); ++i) {
>     char name[5];
>     sprintf(&(name[0]), "%d", i);
>     row_names(i) = name;
>   }
>   returned_frame.attr("row.names") = row_names;
> 
>   StringVector col_names(returned_frame.length());
>   for (int j = 0; j < returned_frame.length(); ++j) {
>     char name[6];
>     sprintf(&(name[0]), "X.%d", j);
>     col_names(j) = name;
>   }
>   returned_frame.attr("names") = col_names;
>   returned_frame.attr("class") = "data.frame";
> 
>   return returned_frame;
> }
> There are some subtleties in this code:
> 
> * It turns out that one can't send super-large data frames to it because of possible buffer overflows.  I've never seen that problem when I've written Rcpp functions which exchanged SEXPs with R, but this one uses Rcpp::export in order to use sourceCpp.
> * Notice the invocation of clone() in the first line of the code.  If you don't do that, you wind up side-effecting the parameter, which is not what most people would expect.
> 
