<div dir="ltr">I would appreciate such a function, yes. Thanks for the explanation.</div><div class="gmail_extra"><br><br><div class="gmail_quote">On Mon, Apr 7, 2014 at 2:25 PM, Arunkumar Srinivasan <span dir="ltr"><<a href="mailto:aragorn168b@gmail.com" target="_blank">aragorn168b@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><p><code>as.data.frame</code> is an S3 generic with a <code>.data.table</code> method, and that method is definitely faster than <code>data.frame()</code>. But it still does a <code>copy(.)</code>. <code>data.frame(.)</code> would also convert strings to factors by default (<code>stringsAsFactors=TRUE</code>).</p>
<p>The most efficient way to convert a <code>data.table</code> to a <code>data.frame</code> would be to do things by reference (in place). The code is already available in <code>as.data.frame</code>; just remove the <code>copy(.)</code>:</p>
<pre><code># convert data.table to data.frame by reference
setDF <- function(x) {
    if (!is.data.table(x))
        stop("x must be a data.table")
    setattr(x, "row.names", .set_row_names(nrow(x)))
    setattr(x, "class", "data.frame")
    setattr(x, "sorted", NULL)
    setattr(x, ".internal.selfref", NULL)
}
</code></pre>
<p>Now you have a function that converts a <code>data.table</code> to a <code>data.frame</code> <em>by reference</em>.</p>
<pre><code>require(data.table)
dat <- data.table(x=1:5, y=6:10)
setDF(dat) # dat is now a data.frame
</code></pre>
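<p>One way to check the by-reference claim is to compare memory addresses before and after the conversion. The following is only a sketch: it assumes a <code>data.table</code> release that exports <code>setDF</code> (later releases do; with the version discussed here you would use the function defined above), and it relies on <code>address()</code> from <code>data.table</code>. The variable names are illustrative.</p>
<pre><code>library(data.table)

dat <- data.table(x = 1:5, y = 6:10)
addr_table  <- address(dat)      # address of the table itself
addr_column <- address(dat$x)    # address of the first column

# as.data.frame() goes through copy(), so its columns are new vectors
df_copy <- as.data.frame(dat)
stopifnot(address(df_copy$x) != addr_column)

# setDF() only changes attributes, so nothing should be reallocated
setDF(dat)
stopifnot(identical(class(dat), "data.frame"))
stopifnot(identical(address(dat), addr_table))
stopifnot(identical(address(dat$x), addr_column))

# setDT() converts back, again without copying the column data
setDT(dat)
stopifnot(is.data.table(dat))
</code></pre>
<p>The address of the table is not checked after <code>setDT</code>, because <code>setDT</code> may reallocate the column-pointer vector when it over-allocates; the column data itself is not copied.</p>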
<p>We should probably export this function as well, like <code>setDT</code>, so that users can switch between the two as they wish without a performance hit?</p>
<p></p><div style="font-family:Helvetica,Arial;font-size:13px;color:rgba(0,0,0,1.0);margin:0px;line-height:auto"><br></div> <div><div style="font-family:helvetica,arial;font-size:13px">Arun</div></div> <div style><br>From: <span style>Chris Neff</span> <a href="mailto:caneff@gmail.com" target="_blank">caneff@gmail.com</a><br>
Reply: <span style>Chris Neff</span> <a href="mailto:caneff@gmail.com" target="_blank">caneff@gmail.com</a><br>Date: <span style>April 7, 2014 at 5:32:47 PM</span><br>To: <span style><a href="mailto:datatable-help@lists.r-forge.r-project.org" target="_blank">datatable-help@lists.r-forge.r-project.org</a></span> <a href="mailto:datatable-help@lists.r-forge.r-project.org" target="_blank">datatable-help@lists.r-forge.r-project.org</a><br>
Subject: <span style> [datatable-help] Is there any overhead to converting back and forth from a data.table to a data.frame? <br></span></div><br> <blockquote type="cite"><span><div><div></div><div><div><div class="h5">
<div dir="ltr">I prefer data.tables for all the code processing I
do. But others on my team using my functions aren't
comfortable with data.tables, so most of the libraries I write end
with<br>
<br>
<div> return(data.frame(DT))</div>
<div><br></div>
<div>Is there any copying or other overhead happening there? Since
it inherits from data.frame, I think the answer is no.</div>
<div><br></div>
<div>Now, if I have a function that does such a return, but I wrap
that itself in a data.table call:</div>
<div><br></div>
<div>data.table(func_that_returns_df())</div>
<div><br></div>
<div>Is there any inefficiency there? Is there a difference
between data.table() and as.data.table() here?</div>
</div></div></div>
</div></div></span></blockquote><p>
</p></div></blockquote></div><br></div>