[Rcpp-devel] Speed up of the data.frame creation in DataFrame.h

Dirk Eddelbuettel edd at debian.org
Mon Jun 9 00:42:46 CEST 2014


On 8 June 2014 at 13:33, Christian Gunning wrote:
| Hello list, long time no see!

Indeed :)
 
| Dmitry,
| Have you identified any other consequences than what Romain pointed
| out?  This information would be useful for the rest of us.
| 
| Some key points that I agree with:
|  * as per Dirk: this is a nice little piece of sleuthing.  Your
| benchmarking shows that the effect is significant.
|  * as per your comments: a key intent of Rcpp is to allow the user the
| freedom to achieve optimization and do their own error checking.
|   * as per Romain: let's not break things.
| 
| It seems possible to address all of these points, perhaps with a
| dedicated function, as per your comments.  I can help with this, if
| you're interested.

Also note the existing 'faster data.frame creation from lists' article at
the Rcpp Gallery:

  http://gallery.rcpp.org/articles/faster-data-frame-creation/

I only looked at it briefly in this context, and it seems that some changes
we made to Rcpp since this Gallery post was written make its suggestion a
little less dramatic.  But I did not have time to dig deeper.

But your point two here is key:  Rcpp is a _framework_ and users are of
course free to add little helper functions at their end -- and even
encouraged to do so.  Not everything has to happen at the Rcpp side of things.

| Key question: what is the intended behavior of this function?  E.g.,
| throw an exception on length mismatch?  My vote is for a limited
| function that deals with a limited number of use cases and provides
| reasonable error-checking (e.g. throws exception for input outside
| scope), versus a logic-heavy function that handles recycling, for
| example.  Does this match your use-case?

Good ideas.
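
To make the "limited function" idea concrete, here is a minimal sketch of the
strict behaviour described above: every column must have the same length, a
mismatch throws, and nothing is recycled.  It is an illustration only, not
code from Rcpp.  The names Column, Frame and make_frame_strict are
hypothetical, and plain standard-library types stand in for Rcpp vectors so
that the sketch compiles without R.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a named column; real code would hold an Rcpp vector.
struct Column {
    std::string name;
    std::vector<double> values;
};

// Hypothetical stand-in for the resulting data frame.
struct Frame {
    std::vector<Column> columns;
    std::size_t nrow;
};

// Build a Frame from columns, insisting that all columns have equal length.
// No recycling is attempted: any length mismatch is reported via an exception.
inline Frame make_frame_strict(std::vector<Column> columns) {
    if (columns.empty()) {
        return Frame{std::move(columns), 0};
    }
    // The first column fixes the expected number of rows.
    const std::size_t n = columns.front().values.size();
    for (const Column& col : columns) {
        if (col.values.size() != n) {
            throw std::invalid_argument(
                "column '" + col.name + "' has length " +
                std::to_string(col.values.size()) + ", expected " +
                std::to_string(n));
        }
    }
    return Frame{std::move(columns), n};
}
```

The check is a single pass over the columns, so its cost is small next to
whatever work the caller does to fill them.  Anything outside this narrow
scope (recycling, type coercion) is left to the caller.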

Dirk

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org

