[Rcpp-devel] Speed up of the data.frame creation in DataFrame.h
Dirk Eddelbuettel
edd at debian.org
Mon Jun 9 00:42:46 CEST 2014
On 8 June 2014 at 13:33, Christian Gunning wrote:
| Hello list, long time no see!
Indeed :)
| Dmitry,
| Have you identified any other consequences than what Romain pointed
| out? This information would be useful for the rest of us.
|
| Some key points that I agree with:
| * as per Dirk: this is a nice little piece of sleuthing. Your
| benchmarking shows that the effect is significant.
| * as per your comments: a key intent of Rcpp is to allow the user the
| freedom to achieve optimization and do their own error checking.
| * as per Romain: let's not break things.
|
| It seems possible to address all of these points, perhaps with a
| dedicated function, as per your comments. I can help with this, if
| you're interested.
Also note the existing 'faster data.frame creation from lists' article at
the Rcpp Gallery:
http://gallery.rcpp.org/articles/faster-data-frame-creation/
I only looked at it briefly in this context, and it seems that some changes
we made to Rcpp since this Gallery post was written make its suggestion a
little less dramatic. But I did not have time to dig deeper.
But your point two here is key: Rcpp is a _framework_ and users are of
course free to add little helper functions at their end -- and even
encouraged to do so. Not everything has to happen at the Rcpp side of things.
| Key question: what is the intended behavior of this function? E.g.,
| throw an exception on length mismatch? My vote is for a limited
| function that deals with a limited number of use cases and provides
| reasonable error-checking (e.g. throws exception for input outside
| scope), versus a logic-heavy function that handles recycling, for
| example. Does this match your use-case?
Good ideas.
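To make the "limited function" option concrete, here is a minimal sketch of
just the error-checking policy (throw on length mismatch, no recycling). It
is plain standard C++ and deliberately independent of Rcpp; the name
common_length and its signature are hypothetical, not anything that exists
in Rcpp. In a real helper the lengths would presumably be read from the
list's columns, and the throw could be replaced by Rcpp::stop().

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of Rcpp): given the length of each column,
// return the common row count, or throw if any column differs from the first.
// No recycling is attempted; an empty set of columns yields zero rows.
inline std::size_t common_length(const std::vector<std::size_t>& column_lengths) {
    if (column_lengths.empty()) return 0;
    const std::size_t n = column_lengths.front();
    for (std::size_t i = 1; i < column_lengths.size(); ++i) {
        if (column_lengths[i] != n) {
            throw std::invalid_argument(
                "column " + std::to_string(i) + " has length " +
                std::to_string(column_lengths[i]) + ", expected " +
                std::to_string(n));
        }
    }
    return n;
}
```

A caller would run this check once up front and only then take the fast
path, so the cost of validation stays linear in the number of columns.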
Dirk
--
http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org