<div dir="ltr">As of R 2.15.1, data.frame no longer appears to be O(n^2) in the number of columns in the frame. That's certainly an improvement.<div><br></div><div>However, by eliminating calls to data.frame and replacing them with direct class modifications, I can reduce a routine that takes minutes to one that takes seconds. So, pragmatically, it appears I can get a speedup of roughly a factor of sixty in Rcpp.</div>
</div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Jan 17, 2013 at 7:46 PM, Paul Johnson <span dir="ltr"><<a href="mailto:pauljohn32@gmail.com" target="_blank">pauljohn32@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">On Tue, Jan 15, 2013 at 9:20 AM, John Merrill <<a href="mailto:john.merrill@gmail.com">john.merrill@gmail.com</a>> wrote:<br>
> It appears that DataFrame::create is a thin layer on top of the R data.frame<br>
> call. That guarantees correctness, but also means the performance of an Rcpp<br>
> routine which returns a large data frame is limited by the performance of<br>
> data.frame -- which is utterly horrible.<br>
<br>
</div>Are you certain that this claim is still true?<br>
<br>
I was shocked/surprised by the package "dataframe" and the commentary<br>
about it. The author said that data.frame was slow because it was<br>
repeatedly copying some objects, and he demonstrated a substantially<br>
faster approach. In his words: "This contains versions of standard<br>
data frame functions in R, modified to avoid making extra copies of<br>
inputs. This is faster, particularly for large data."<br>
<br>
In the release notes for R-2.15.1, I recall seeing a note that R Core<br>
had responded by integrating several of those changes. But still<br>
data.frame is not fast for you?<br>
<br>
If they didn't make the core data.frame as fast, would you care to<br>
enlighten us by installing the dataframe package and letting us know<br>
if it is still faster?<br>
<br>
Or perhaps you are way ahead of me and you've already imitated<br>
Hesterberg's algorithms in your C++ design?<br>
<span class="HOEnZb"><font color="#888888"><br>
pj<br>
<br>
--<br>
Paul E. Johnson<br>
Professor, Political Science Assoc. Director<br>
1541 Lilac Lane, Room 504 Center for Research Methods<br>
University of Kansas University of Kansas<br>
<a href="http://pj.freefaculty.org" target="_blank">http://pj.freefaculty.org</a> <a href="http://quant.ku.edu" target="_blank">http://quant.ku.edu</a><br>
</font></span></blockquote></div><br></div>