<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><p>Thanks, that helped. To illustrate with the large data set from your first post, your question boils down to:</p>
<pre><code>require(data.table) ## 1.9.3
set.seed(12312391)
d <- data.table(
group = sample(1e3,1e7,replace=TRUE),
time = ceiling(runif(1e7, 0, 1e5)),
hit = rbinom(1e7, 1, p = 0.1) == 1, ## logical, so that d[(hit)] keeps the rows with a hit
key=c("group","time"))
system.time(ans1 <- d[(hit)][d,list(hittime=time),roll=-20,by=.EACHI]) ## 5.4 sec
system.time(ans2 <- d[(hit)][d,time,roll=-20,by=.EACHI]) ## 3.4 sec
setnames(ans2, 3L, "hittime")
setkey(ans1, NULL)
setkey(ans2, NULL)
identical(ans1, ans2) # [1] TRUE
</code></pre>
<p>Why the difference? That’s a great question!</p>
<p>Note that this is not because you left the column unnamed (<code>[.data.table</code> is clever enough to remove the names before calling <code>dogroups</code>). Just to be sure, we’ll check:</p>
<pre><code>system.time(ans3 <- d[(hit)][d,list(time),roll=-20,by=.EACHI]) ## 5.7 sec
setnames(ans3, 3L, "hittime")
setkey(ans3, NULL)
identical(ans1, ans3) # [1] TRUE
</code></pre>
<p>The difference comes from the <code>list(.)</code> wrapper in the <code>j-expression</code> of both slow cases. At C level, the <code>j-expression</code> is evaluated once for each group: in the slow cases that is <code>eval(list(time))</code>, and in the fast case it is <code>eval(time)</code>. My guess is that this difference in the call accounts for the difference in timing.</p>
<p>It’d be easy to test this by writing a simple C script that evaluates both expressions, but I don’t have the time to do that right now. However, here’s an alternative “easy route” to verify it:</p>
<pre><code>require(data.table) ## 1.9.3
DT <- data.table(x=rep(1:1e7, 2L), y=1L)
system.time(ans1 <- DT[, .N, by=x]) ## 3.5 sec
system.time(ans2 <- DT[, list(N = .N), by=x]) ## 5.8 sec
</code></pre>
<p>Basically, when the <code>j-expression</code> has just one entry, we <em>could</em> gain some speedup by removing the <code>list()</code> wrapped around it.</p>
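<p>For a rough feel of the per-call cost outside data.table, one could evaluate both forms of the expression repeatedly at R level. This is only a sketch under assumptions: the helper <code>time_eval</code> and the environment <code>env</code> are made up for illustration, the loop does not reproduce what <code>dogroups</code> does in C, and the timings will vary by machine.</p>
<pre><code>## Hypothetical helper (not part of data.table): evaluate `expr` n times in `env`
## and return the elapsed time in seconds
time_eval <- function(expr, env, n = 1e6L) {
  system.time(for (i in seq_len(n)) eval(expr, env))[["elapsed"]]
}
env <- new.env()
assign("time", 1:10, envir = env)   ## stands in for one group's column
bare    <- quote(time)              ## the form evaluated in the fast case
wrapped <- quote(list(time))        ## the form evaluated in the slow cases
## both forms carry the same values, only the list() wrapping differs
stopifnot(identical(eval(wrapped, env)[[1L]], eval(bare, env)))
t_bare    <- time_eval(bare, env)
t_wrapped <- time_eval(wrapped, env)
c(bare = t_bare, wrapped = t_wrapped)
</code></pre>
<p>If the guess above is right, the wrapped form should take noticeably longer, since each evaluation makes an extra function call and allocates a list.</p>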
<p>It’d be great if you could cite this thread from the data.table mailing list and file an issue here: https://github.com/Rdatatable/data.table/issues?direction=desc&labels=&milestone=&page=1&sort=updated&state=open</p>
<p><style>body{font-family:Helvetica,Arial;font-size:13px}</style><style>body {
font-family: "Helvetica Neue", Helvetica, Arial, sans-serif;
padding:1em;
margin:auto;
background:#fefefe;
}
h1, h2, h3, h4, h5, h6 {
font-weight: bold;
}
h1 {
color: #000000;
font-size: 28pt;
}
h2 {
border-bottom: 1px solid #CCCCCC;
color: #000000;
font-size: 24px;
}
h3 {
font-size: 18px;
}
h4 {
font-size: 16px;
}
h5 {
font-size: 14px;
}
h6 {
color: #777777;
background-color: inherit;
font-size: 14px;
}
hr {
height: 0.2em;
border: 0;
color: #CCCCCC;
background-color: #CCCCCC;
}
p, blockquote, ul, ol, dl, li, table, pre {
margin: 15px 0;
}
a, a:visited {
color: #4183C4;
background-color: inherit;
text-decoration: none;
}
#message {
border-radius: 6px;
border: 1px solid #ccc;
display:block;
width:100%;
height:60px;
margin:6px 0px;
}
button, #ws {
font-size: 12pt;
padding: 4px 6px;
border-radius: 5px;
border: 1px solid #bbb;
background-color: #eee;
}
code, pre, #ws, #message {
font-family: Monaco;
font-size: 10pt;
border-radius: 3px;
background-color: #F8F8F8;
color: inherit;
}
code {
border: 1px solid #EAEAEA;
margin: 0 2px;
padding: 0 5px;
}
pre {
border: 1px solid #CCCCCC;
overflow: auto;
padding: 4px 8px;
}
pre > code {
border: 0;
margin: 0;
padding: 0;
}
#ws { background-color: #f8f8f8; }
table {
border-collapse: collapse;
font-family: Helvetica, arial, freesans, clean, sans-serif;
color: rgb(51, 51, 51);
font-size: 15px; line-height: 25px;
padding: 0; }
table tr {
border-top: 1px solid #cccccc;
background-color: white;
margin: 0;
padding: 0; }
table tr:nth-child(2n) {
background-color: #f8f8f8; }
table tr th {
font-weight: bold;
border: 1px solid #cccccc;
margin: 0;
padding: 6px 13px; }
table tr td {
border: 1px solid #cccccc;
margin: 0;
padding: 6px 13px; }
table tr th :first-child, table tr td :first-child {
margin-top: 0; }
table tr th :last-child, table tr td :last-child {
margin-bottom: 0; }
.send { color:#77bb77; }
.server { color:#7799bb; }
.error { color:#AA0000; }</style></p><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;"><br></div> <div id="bloop_sign_1404169480694407936" class="bloop_sign"><div style="font-family:helvetica,arial;font-size:13px">Arun</div></div> <div style="color:black"><br>From: <span style="color:black">Stavros Macrakis (Σταῦρος Μακράκης)</span> <a href="mailto:macrakis@alum.mit.edu">macrakis@alum.mit.edu</a><br>Reply: <span style="color:black">Stavros Macrakis (Σταῦρος Μακράκης)</span> <a href="mailto:macrakis@alum.mit.edu">macrakis@alum.mit.edu</a><br>Date: <span style="color:black">July 1, 2014 at 12:51:36 AM</span><br>To: <span style="color:black">Arunkumar Srinivasan</span> <a href="mailto:aragorn168b@gmail.com">aragorn168b@gmail.com</a><br>Cc: <span style="color:black">datatable-help@r-forge.wu-wien.ac.at</span> <a href="mailto:datatable-help@r-forge.wu-wien.ac.at">datatable-help@r-forge.wu-wien.ac.at</a><br>Subject: <span style="color:black"> Re: [datatable-help] Speeding up column references with roll <br></span></div><br> <blockquote type="cite" class="clean_bq"><span><div><div></div><div>
<title></title>
<div dir="ltr">
<div class="gmail_default" style="font-family:georgia,serif;font-size:small;color:rgb(51,0,0)">
Thanks for your reply, but your code doesn't do the same thing as
mine. Here's a very small example of what I'm trying to do.</div>
<div class="gmail_default" style="font-family:georgia,serif;font-size:small;color:rgb(51,0,0)">
<br></div>
<div class="gmail_default" style="">
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace"># Test data</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace"><br></font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">> dd <-
data.table(groups=rep(1:2,each=4),time=1:8,hit=1:8%%3==0,key=c("groups","time"))</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">> dd</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace"> groups time
hit</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">1: 1 1
FALSE</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">2: 1 2
FALSE</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">3: 1 3
TRUE</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">4: 1 4
FALSE</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">5: 2 5
FALSE</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">6: 2 6
TRUE</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">7: 2 7
FALSE</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">8: 2 8
FALSE</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace"><br></font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace"># Desired output includes the time and the
corresponding roll time</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace"><br></font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">> (res1 <-
dd[(hit)][dd,list(rolltime=time),roll=2,by=.EACHI][!is.na(rolltime)])</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace"> groups time
rolltime</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">1: 1 3
3</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">2: 1 4
3</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">3: 2 6
6</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">4: 2 7
6</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">5: 2 8
6</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace"><br></font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace"># Undesired output (without
.EACHI)</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace"><br></font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">> (res2 <-
dd[hit==1][dd,list(rolltime=time),roll=2][!is.na(rolltime)])</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace"> rolltime</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">1: 1</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">2: 2</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">3: 3</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">4: 4</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">5: 5</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">6: 6</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">7: 7</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">8: 8</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace"><br></font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace"># Undesired output (with
allow.cartesian)</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace"><br></font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">> (res3 <-
dd[hit==1][dd,list(rolltime=time),roll=2,allow.cartesian=TRUE][!is.na(rolltime)])</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">> identical(res2,res3)</font></div>
<div class="gmail_default" style=""><font color="#330000" face="courier new, monospace">[1] TRUE</font></div>
<div class="gmail_default" style=""><font color="#330000" face="georgia, serif"><br></font></div>
<div class="gmail_default" style="color:rgb(51,0,0);font-family:georgia,serif;font-size:small">Re
rolltime vs. time, consider the following </div>
<div class="gmail_default" style="color:rgb(51,0,0);font-family:georgia,serif;font-size:small">
<br></div>
<div class="gmail_default" style="color:rgb(51,0,0);font-size:small">
<div class="gmail_default"><font face="courier new, monospace">>
dd[(hit)][dd,time,roll=2,by=.EACHI]</font></div>
<div class="gmail_default"><font face="courier new, monospace"> groups time time</font></div>
<div class="gmail_default"><font face="courier new, monospace">1:
1 1 NA</font></div>
<div class="gmail_default"><font face="courier new, monospace">2:
1 2 NA</font></div>
<div class="gmail_default"><font face="courier new, monospace">3:
1 3 3</font></div>
<div class="gmail_default"><font face="courier new, monospace">4:
1 4 3</font></div>
<div class="gmail_default"><font face="courier new, monospace">5:
2 5 NA</font></div>
<div class="gmail_default"><font face="courier new, monospace">6:
2 6 6</font></div>
<div class="gmail_default"><font face="courier new, monospace">7:
2 7 6</font></div>
<div class="gmail_default"><font face="courier new, monospace">8:
2 8 6</font></div>
<div class="gmail_default" style="font-family:georgia,serif">
<br></div>
<div class="gmail_default" style="font-family:georgia,serif">There
are two different output columns named 'time'. One is the time from
the right relation of the join, the other is the time from the left
relation of the join. There is nothing like the i.time convention
for distinguishing the time that comes from one of the tables from
the (rolled) time that comes from the other.</div>
<div class="gmail_default" style="font-family:georgia,serif">
<br></div>
<div class="gmail_default" style="font-family:georgia,serif">
-s</div>
</div>
<div class="gmail_default" style="">
<div class="gmail_default" style="color:rgb(51,0,0);font-size:small"><br></div>
</div>
</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Mon, Jun 30, 2014 at 5:34 PM, Arunkumar
Srinivasan <span dir="ltr"><<a href="mailto:aragorn168b@gmail.com" target="_blank">aragorn168b@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="word-wrap:break-word">
<p>Your example doesn’t work without
<code>allow.cartesian=TRUE</code>.</p>
<p>You <em>shouldn’t</em> be using <code>by=.EACHI</code> here.
This <code>by</code> was what was implicit in the earlier versions
which made it slow. Please re-read the README.</p>
<p>Here’s the function I tested on 1.9.3:</p>
<pre><code>calc1 <- function(d) {
d[ hit==1][ d,list(hittime=time),roll=-20, allow.cartesian=TRUE][ !is.na(hittime)]
}
calc2 <- function(d) {
temp <- d[ hit==1][ d,list(time),roll=-20, allow.cartesian=TRUE]
setnames(temp,1,"hittime")
temp[!is.na(hittime)]
}
# Generate sample data
set.seed(12312391)
data <- data.table(
group = sample(1e3,1e7,replace=T),
time = ceiling(runif(1e7, 0, 1e5)),
hit = rbinom(1e7, 1, p = 0.1),
key=c("group","time"))
system.time(ans1 <- calc1(data))
# user system elapsed
# 2.083 0.189 2.344
system.time(ans2 <- calc2(data))
# user system elapsed
# 2.012 0.241 2.426
identical(ans1, ans2) # [1] TRUE
</code>
</pre>
<div class="">
<pre><code>You write:
I also don't see any way to refer to the different time vs. hittime without renaming the second time column.
</code>
</pre></div>
<p>I don’t quite follow what this means, but IIUC I think this is
what you’re referring to: <a href="https://github.com/Rdatatable/data.table/issues/471" target="_blank">https://github.com/Rdatatable/data.table/issues/471</a></p>
<div class="">
<pre><code>You write:
You mention some FR's, but they're hard to find without the specific numbers.
</code>
</pre></div>
<p>I was mentioning the first two points under <strong>NEW
FEATURES</strong> within <code>Changes in v1.9.3</code>. The one
that starts with <code>by=.EACHI runs j for each group in x that
each row of i joins to.</code> and the one that starts with
<code>Accordingly, X[Y, j] now does what X[Y][, j] did.</code></p>
<p>Maybe we should start numbering the fixes for easy reference.
Will note it down.</p>
<pre><code>You write: Where can I find the 1.9.3 reference manual?
</code>
</pre>
<p>This is a development version. Necessary changes will be
reflected in their corresponding <code>?...</code> entry. And when
we find some time, the introduction and FAQs will be updated. But
that’s not yet.</p>
<p>If you don’t wish to keep up-to-date by looking at the NEWS,
you’ll have to wait until the next stable release on CRAN.</p>
<pre><code>You write: On my system (MacOSX), build_vignettes=TRUE gives an error in texi2dvi -- would that have generated the refman? If so, how do I fix that?
</code>
</pre>
<p>I’m guessing it’s a PDF latex error. If so, you’ll have to
install what the error message says is missing on your system.
Sorry, can’t help you much there.</p>
<div style="font-family:Helvetica,Arial;font-size:13px;color:rgba(0,0,0,1.0);margin:0px;line-height:auto">
<br></div>
<div>
<div style="font-family:helvetica,arial;font-size:13px">Arun</div>
</div>
<div style="color:black">
<div class=""><br>
From: <span style="color:black">Stavros Macrakis (Σταῦρος
Μακράκης)</span> <a href="mailto:macrakis@alum.mit.edu" target="_blank">macrakis@alum.mit.edu</a><br>
Reply: <span style="color:black">Stavros Macrakis (Σταῦρος
Μακράκης)</span> <a href="mailto:macrakis@alum.mit.edu" target="_blank">macrakis@alum.mit.edu</a><br></div>
Date: <span style="color:black">June 30, 2014 at 10:40:24
PM</span><br>
To: <span style="color:black">Arunkumar Srinivasan</span>
<a href="mailto:aragorn168b@gmail.com" target="_blank">aragorn168b@gmail.com</a><br>
Cc: <span style="color:black"><a href="mailto:datatable-help@r-forge.wu-wien.ac.at" target="_blank">datatable-help@r-forge.wu-wien.ac.at</a></span> <a href="mailto:datatable-help@r-forge.wu-wien.ac.at" target="_blank">datatable-help@r-forge.wu-wien.ac.at</a><br>
Subject: <span style="color:black">Re: [datatable-help]
Speeding up column references with roll<br></span></div>
<div>
<div class="h5"><br>
<blockquote type="cite">
<div>
<div>
<div dir="ltr">
<div class="gmail_default" style="font-family:georgia,serif;font-size:small;color:#330000"><span>OK,
I'm retesting in 1.9.3, adding by=.EACHI. I don't see any
significant difference in the timings -- setnames is still 25%
faster than list(hittime=time). What exactly was
fixed?</span></div>
<div class="gmail_default" style="font-family:georgia,serif;font-size:small;color:#330000">
<span><br></span></div>
<div class="gmail_default" style="font-family:georgia,serif;font-size:small;color:#330000"><span>I
also don't see any way to refer to the different time vs. hittime
without renaming the second time column.</span></div>
<div class="gmail_default" style="font-family:georgia,serif;font-size:small;color:#330000">
<span><br></span></div>
<div class="gmail_default" style="font-family:georgia,serif;font-size:small;color:#330000"><span>You
mention some FR's, but they're hard to find without the specific
numbers.</span></div>
<div class="gmail_default" style="font-family:georgia,serif;font-size:small;color:#330000">
<span><br></span></div>
<div class="gmail_default" style="font-family:georgia,serif;font-size:small;color:#330000">
<span>Where can I find the 1.9.3 reference manual? I think it would
be easier to understand for me than the incremental changes in the
New Features listings. On my system (MacOSX), build_vignettes=TRUE
gives an error in texi2dvi -- would that have generated the refman?
If so, how do I fix that?</span></div>
<div class="gmail_default" style="font-family:georgia,serif;font-size:small;color:#330000">
<span><br></span></div>
<div class="gmail_default" style="font-family:georgia,serif;font-size:small;color:#330000">
<span>Thanks,</span></div>
<div class="gmail_default" style="font-family:georgia,serif;font-size:small;color:#330000">
<span><br></span></div>
<div class="gmail_default" style="font-family:georgia,serif;font-size:small;color:#330000">
<span>
-s</span></div>
</div>
<div class="gmail_extra"><span><br>
<br></span>
<div class="gmail_quote"><span>On Mon, Jun 30, 2014 at 1:00 PM,
Arunkumar Srinivasan <span dir="ltr"><<a href="mailto:aragorn168b@gmail.com" target="_blank">aragorn168b@gmail.com</a>></span> wrote:<br></span>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="word-wrap:break-word">
<div style="font-family:Helvetica,Arial;font-size:13px;color:rgba(0,0,0,1.0);margin:0px;line-height:auto">
Once again, this has been fixed in 1.9.3. A join now requires an explicit `by=.EACHI`
to perform a by-without-by.</div>
<div style="font-family:Helvetica,Arial;font-size:13px;color:rgba(0,0,0,1.0);margin:0px;line-height:auto">
<a href="https://github.com/Rdatatable/data.table/blob/master/README.md" target="_blank">https://github.com/Rdatatable/data.table/blob/master/README.md</a></div>
<div style="font-family:Helvetica,Arial;font-size:13px;color:rgba(0,0,0,1.0);margin:0px;line-height:auto">
Have a look at the first FR (by = .EACHI runs ...) that's been
fixed in 1.9.3. These changes, which have been discussed for quite
some time, alter what a join returns in order to bring more
consistency to the DT[i, j, by] syntax.
Also have a look at the second FR and the links it points to for
the discussions.</div>
<div style="font-family:Helvetica,Arial;font-size:13px;color:rgba(0,0,0,1.0);margin:0px;line-height:auto">
<br></div>
<div style="font-family:Helvetica,Arial;font-size:13px;color:rgba(0,0,0,1.0);margin:0px;line-height:auto">
In general, it's better to test with the devel version (and have a
look at README) for any bugs you may encounter.</div>
<div style="font-family:Helvetica,Arial;font-size:13px;color:rgba(0,0,0,1.0);margin:0px;line-height:auto">
<br></div>
<div>
<div style="font-family:helvetica,arial;font-size:13px">Arun</div>
</div>
<div style="color:black"><br>
From: <span style="color:black">Stavros Macrakis (Σταῦρος
Μακράκης)</span> <a href="mailto:macrakis@alum.mit.edu" target="_blank">macrakis@alum.mit.edu</a><br>
Reply: <span style="color:black">Stavros Macrakis (Σταῦρος
Μακράκης)</span> <a href="mailto:macrakis@alum.mit.edu" target="_blank">macrakis@alum.mit.edu</a><br>
Date: <span style="color:black">June 30, 2014 at 5:38:10
PM</span><br>
To: <span style="color:black"><a href="mailto:datatable-help@r-forge.wu-wien.ac.at" target="_blank">datatable-help@r-forge.wu-wien.ac.at</a></span> <a href="mailto:datatable-help@r-forge.wu-wien.ac.at" target="_blank">datatable-help@r-forge.wu-wien.ac.at</a><br>
Subject: <span style="color:black">[datatable-help] Speeding
up column references with roll<br></span></div>
<br>
<blockquote type="cite">
<div>
<div>
<div>
<div>
<div dir="ltr">
<div class="gmail_default" style="font-family:georgia,serif;font-size:small;color:rgb(51,0,0)">
<span>In the following example, it is about 15-25% faster to use
setnames rather than j=list(name=var). Is there some better
approach to referencing the other joined column when using
roll?</span></div>
<div class="gmail_default" style="font-family:georgia,serif;font-size:small;color:rgb(51,0,0)">
<span><br></span></div>
<div class="gmail_default">
<div class="gmail_default"><span><span style="color:rgb(51,0,0);font-family:'courier new',monospace"># Use
j=list(name=var)</span><br></span></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace">calc1 <- function(d) {</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace"> d[ hit==1</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace"> ][
d,list(hittime=time),roll=-20</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace"> ][ !is.na(hittime)</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace"> ]</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace">}</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace"><br></font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace"># Use setnames</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace">calc2 <- function(d) {</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace"> temp <- d[ hit==1</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace">
][ d,time,roll=-20</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace">
]</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace">
setnames(temp,3,"hittime")</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace"> temp[!is.na(hittime)]</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace">}</font></div>
<div class="gmail_default" style="color:rgb(51,0,0);font-family:georgia,serif;font-size:small">
<br></div>
</div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace"># Generate sample data</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace">set.seed(12312391)</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace">data <- data.table(</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace"> group =
sample(1e3,1e7,replace=T),</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace"> time =
ceiling(runif(1e7, 0, 1e5)),</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace"> hit =
rbinom(1e7, 1, p = 0.1),</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace"> key=c("group","time"))</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace"><br></font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace"># Timing</font></div>
<div class="gmail_default"><font color="#330000" face="courier new, monospace"><br></font></div>
<div class="gmail_default"><span style="color:rgb(51,0,0);font-family:'courier new',monospace">system.time(replicate(10,{gc();calc1(data)}))
=> 69 sec<br>system.time(replicate(10,{gc();calc2(data)})) => 52
sec</span><br></div>
</div>
</div>
</div>
_______________________________________________<br>
datatable-help mailing list<br>
<a href="mailto:datatable-help@lists.r-forge.r-project.org" target="_blank">datatable-help@lists.r-forge.r-project.org</a><br>
<a href="https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help" target="_blank">https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help</a></div>
</div>
</blockquote>
</div>
</blockquote>
</div>
<br></div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
</div>
<br></div>
</div></div></span></blockquote><p></p></body></html>