[Rcpp-devel] Performance question about DataFrame

Paul Johnson pauljohn32 at gmail.com
Fri Jan 18 04:46:43 CET 2013


On Tue, Jan 15, 2013 at 9:20 AM, John Merrill <john.merrill at gmail.com> wrote:
> It appears that DataFrame::create is a thin layer on top of the R data.frame
> call.  That guarantees correctness, but also means the performance of an Rcpp
> routine which returns a large data frame is limited by the performance of
> data.frame -- which is utterly horrible.

Are you certain that this claim is still true?

I was surprised by the package "dataframe" and the commentary about
it. The author said that data.frame was slow because it was repeatedly
copying some objects, and he demonstrated a substantially faster
approach. The package description reads: "This contains versions of
standard data frame functions in R, modified to avoid making extra
copies of inputs. This is faster, particularly for large data."

In the release notes for R-2.15.1, I recall seeing a note that R Core
had responded by integrating several of those changes. Is data.frame
still not fast enough for you, even so?

If they didn't make the core data.frame as fast, would you care to
enlighten us by installing the dataframe package and letting us know
if it is still faster?

Or perhaps you are way ahead of me and you've already imitated
Hesterberg's algorithms in your C++ design?

pj

-- 
Paul E. Johnson
Professor, Political Science      Assoc. Director
1541 Lilac Lane, Room 504      Center for Research Methods
University of Kansas                 University of Kansas
http://pj.freefaculty.org               http://quant.ku.edu

