<html><head><style>body{font-family:Helvetica,Arial;font-size:13px}</style></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;">Once again, this has been fixed in 1.9.3. A join now requires an explicit `by=.EACHI` to perform a by-without-by.</div><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;"><a href="https://github.com/Rdatatable/data.table/blob/master/README.md">https://github.com/Rdatatable/data.table/blob/master/README.md</a></div><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;">Have a look at the first FR (by = .EACHI runs ...) that's been fixed in 1.9.3. These changes, which have been discussed for quite some time, alter the results a join returns and bring more consistency to the DT[i, j, by] syntax. 
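<div style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;">As a rough illustration of that change (a minimal sketch, assuming data.table 1.9.4 or later; the toy table DT and its columns id and v are made up for this example and are not taken from the original question):</div>
<pre style="font-family:'courier new',monospace">
library(data.table)

# Hypothetical toy data, keyed by id
DT &lt;- data.table(id = c("a", "a", "b", "b", "c"), v = 1:5, key = "id")

# Join without by: j is evaluated once over all rows matched by i
total &lt;- DT[.(c("a", "b")), sum(v)]

# Join with by = .EACHI: j is evaluated once for each row of i
per_i &lt;- DT[.(c("a", "b")), .(total = sum(v)), by = .EACHI]

print(total)   # 10
print(per_i)   # one row per id in i (a: 3, b: 7)
</pre>
<div style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;">Without by, j runs once over all matched rows; with by = .EACHI, it runs once per row of i, which is what a join used to do implicitly.</div>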
Also have a look at the second FR and the links it points to for the discussions.</div><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;"><br></div><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;">In general, it's better to test with the devel version (and have a look at README) for any bugs you may encounter.</div><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;"><br></div> <div id="bloop_sign_1404147394382735104" class="bloop_sign"><div style="font-family:helvetica,arial;font-size:13px">Arun</div></div> <div style="color:black"><br>From: <span style="color:black">Stavros Macrakis (Σταῦρος Μακράκης)</span> <a href="mailto:macrakis@alum.mit.edu">macrakis@alum.mit.edu</a><br>Reply: <span style="color:black">Stavros Macrakis (Σταῦρος Μακράκης)</span> <a href="mailto:macrakis@alum.mit.edu">macrakis@alum.mit.edu</a><br>Date: <span style="color:black">June 30, 2014 at 5:38:10 PM</span><br>To: <span style="color:black">datatable-help@r-forge.wu-wien.ac.at</span> <a href="mailto:datatable-help@r-forge.wu-wien.ac.at">datatable-help@r-forge.wu-wien.ac.at</a><br>Subject: <span style="color:black"> [datatable-help] Speeding up column references with roll <br></span></div><br> <blockquote type="cite" class="clean_bq"><span><div><div></div><div>
<div dir="ltr">
<div class="gmail_default" style="font-family:georgia,serif;font-size:small;color:rgb(51,0,0)">In
the following example, it is about 15-25% faster to use setnames
rather than j=list(name=var). Is there some better approach to
referencing the other joined column when using roll?</div>
<div class="gmail_default" style="font-family:georgia,serif;font-size:small;color:rgb(51,0,0)">
<br></div>
<div class="gmail_default" style="">
<div class="gmail_default" style=""><span style="color:rgb(51,0,0);font-family:'courier new',monospace"># Use
j=list(name=var)</span><br></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">calc1 <- function(d) {</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace"> d[ hit==1</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace"> ][
d,list(hittime=time),roll=-20</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace"> ][ !is.na(hittime)</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace"> ]</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">}</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace"><br></font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace"># Use setnames</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">calc2 <- function(d) {</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace"> temp <- d[ hit==1</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">
][ d,time,roll=-20</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">
]</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">
setnames(temp,3,"hittime")</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace"> temp[!is.na(hittime)]</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">}</font></div>
<div class="gmail_default" style="color:rgb(51,0,0);font-family:georgia,serif;font-size:small">
<br></div>
</div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace"># Generate sample data</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace">set.seed(12312391)</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace">data <- data.table(</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace"> group =
sample(1e3,1e7,replace=TRUE),</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace"> time =
ceiling(runif(1e7, 0, 1e5)),</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace"> hit =
rbinom(1e7, 1, prob = 0.1),</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace"> key=c("group","time"))</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace"><br></font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace"># Timing</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace"><br></font></div>
<div class="gmail_default" style=""><span style="color:rgb(51,0,0);font-family:'courier new',monospace">system.time(replicate(10,{gc();calc1(data)})) => 69 sec<br>system.time(replicate(10,{gc();calc2(data)})) => 52 sec</span><br></div>
</div>
</div></div></span></blockquote></body></html>