[datatable-help] Is there any overhead to converting back and forth from a data.table to a data.frame?

Arunkumar Srinivasan aragorn168b at gmail.com
Mon Apr 7 20:25:36 CEST 2014


as.data.frame is an S3 generic with a data.table method, and it is definitely faster than data.frame(). But it still does a copy(.). data.frame(.) would also convert strings to factors by default (if stringsAsFactors=TRUE).
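
To make the copy visible, here is a minimal sketch (the names DT and DF and the column values are made up for illustration). It overwrites one cell of the data.table in place with set() and then looks at the data.frame; if as.data.frame() had shared the column instead of copying it, the change would show up in both.

```r
library(data.table)

# illustrative data; the names and values are assumptions, not from the thread
DT <- data.table(id = c(1L, 2L, 3L), label = c("a", "b", "c"))
DF <- as.data.frame(DT)   # dispatches to the data.table method

# overwrite one cell of DT by reference
set(DT, i = 1L, j = "id", value = 100L)

DT$id[1]   # the value just written
DF$id[1]   # still the original value, since DF holds its own copy of the column
```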

The most efficient way to convert a data.table to a data.frame would be to do it by reference (in place). The code is already there in the data.table method of as.data.frame; just remove the copy(.):

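# convert data.table to data.frame by reference (no copy is made)
setDF <- function(x) {
    if (!is.data.table(x))
        stop("x must be a data.table")
    # compact integer row names, as a data.frame expects
    setattr(x, "row.names", .set_row_names(nrow(x)))
    setattr(x, "class", "data.frame")
    # drop the data.table-only attributes: the key and the self-reference
    setattr(x, "sorted", NULL)
    setattr(x, ".internal.selfref", NULL)
}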
Now you have a function that converts a data.table to a data.frame by reference.

require(data.table)
dat <- data.table(x=1:5, y=6:10)
setDF(dat) # dat is now a data.frame
We should probably export this function as well, like setDT, so that users can switch between the two as they wish without a performance penalty?
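
One way to check the "by reference" part is to compare the object's memory address before and after the conversion, using address() from data.table. This is only a sketch: setDF is repeated from above so the block runs on its own, and addr_before / addr_after are names made up for the example.

```r
library(data.table)

# setDF as defined earlier in this message, repeated so the block is self-contained
setDF <- function(x) {
    if (!is.data.table(x))
        stop("x must be a data.table")
    setattr(x, "row.names", .set_row_names(nrow(x)))
    setattr(x, "class", "data.frame")
    setattr(x, "sorted", NULL)
    setattr(x, ".internal.selfref", NULL)
}

dat <- data.table(x = 1:5, y = 6:10)
addr_before <- address(dat)   # memory address of the data.table

setDF(dat)                    # dat is now a data.frame
addr_after <- address(dat)

class(dat)                          # only "data.frame" remains
identical(addr_before, addr_after)  # same object, so nothing was copied

setDT(dat)                    # and back to a data.table with setDT
is.data.table(dat)
```

With both functions available, a library can do its work on a data.table and hand back a data.frame at the end without paying for a copy at either step.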


Arun

From: Chris Neff caneff at gmail.com
Reply: Chris Neff caneff at gmail.com
Date: April 7, 2014 at 5:32:47 PM
To: datatable-help at lists.r-forge.r-project.org datatable-help at lists.r-forge.r-project.org
Subject:  [datatable-help] Is there any overhead to converting back and forth from a data.table to a data.frame?  

I prefer data.tables for all the code processing I do.  But others on my team using my functions aren't comfortable with data.tables, so most of the libraries I write end with

 return(data.frame(DT))

Is there any copying or other overhead happening there? Since it inherits from data.frame, I think the answer is no.

Now, if I have a function that does such a return, but I wrap that itself in a data.table call:

data.table(func_that_returns_df())

Is there any inefficiency there?  Is there a difference between data.table() and as.data.table() here?