<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><p>Sorry. But we can simplify it even further:</p>
<p>The first step is just <code>unique(test)</code>. So, we can do:</p>
<pre><code>system.time({
ans = unique(test)
ans = ans[ans[, .I[.N > 1L], by=id]$V1]
})
# 0.016 0.000 0.016
</code></pre>
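<p>A small, hedged illustration of that first step on made-up data (the names <code>toy</code> and <code>u</code> are hypothetical, not from the thread). It assumes that, depending on the data.table release, <code>unique()</code> on a keyed table may de-duplicate on the key columns only, so <code>by</code> is spelled out here to compare whole rows:</p>
<pre><code>library(data.table)

# toy table keyed by id: "a" has two identical rows, "c" has two different rows
toy = data.table(id = c("a", "a", "c", "c"),
                 x1 = c(1L, 1L, 3L, 4L),
                 key = "id")

# de-duplicate on all columns explicitly, whatever the default for keyed tables is
u = unique(toy, by = names(toy))
nrow(u)   # 3: one row for "a", two rows for "c"
</code></pre>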
<p>Identical?</p>
<pre><code>setkey(ans)
setkey(ut1)
identical(ans, ut1) # [1] TRUE
</code></pre>
<p><style>body{font-family:Helvetica,Arial;font-size:13px}</style><style>body {
font-family: "Helvetica Neue", Helvetica, Arial, sans-serif;
padding:1em;
margin:auto;
background:#fefefe;
}
h1, h2, h3, h4, h5, h6 {
font-weight: bold;
}
h1 {
color: #000000;
font-size: 28pt;
}
h2 {
border-bottom: 1px solid #CCCCCC;
color: #000000;
font-size: 24px;
}
h3 {
font-size: 18px;
}
h4 {
font-size: 16px;
}
h5 {
font-size: 14px;
}
h6 {
color: #777777;
background-color: inherit;
font-size: 14px;
}
hr {
height: 0.2em;
border: 0;
color: #CCCCCC;
background-color: #CCCCCC;
}
p, blockquote, ul, ol, dl, li, table, pre {
margin: 15px 0;
}
a, a:visited {
color: #4183C4;
background-color: inherit;
text-decoration: none;
}
#message {
border-radius: 6px;
border: 1px solid #ccc;
display:block;
width:100%;
height:60px;
margin:6px 0px;
}
button, #ws {
font-size: 12 pt;
padding: 4px 6px;
border-radius: 5px;
border: 1px solid #bbb;
background-color: #eee;
}
code, pre, #ws, #message {
font-family: Monaco;
font-size: 10pt;
border-radius: 3px;
background-color: #F8F8F8;
color: inherit;
}
code {
border: 1px solid #EAEAEA;
margin: 0 2px;
padding: 0 5px;
}
pre {
border: 1px solid #CCCCCC;
overflow: auto;
padding: 4px 8px;
}
pre > code {
border: 0;
margin: 0;
padding: 0;
}
#ws { background-color: #f8f8f8; }
table {
border-collapse: collapse;
font-family: Helvetica, arial, freesans, clean, sans-serif;
color: rgb(51, 51, 51);
font-size: 15px; line-height: 25px;
padding: 0; }
table tr {
border-top: 1px solid #cccccc;
background-color: white;
margin: 0;
padding: 0; }
table tr:nth-child(2n) {
background-color: #f8f8f8; }
table tr th {
font-weight: bold;
border: 1px solid #cccccc;
margin: 0;
padding: 6px 13px; }
table tr td {
border: 1px solid #cccccc;
margin: 0;
padding: 6px 13px; }
table tr th :first-child, table tr td :first-child {
margin-top: 0; }
table tr th :last-child, table tr td :last-child {
margin-bottom: 0; }
.send { color:#77bb77; }
.server { color:#7799bb; }
.error { color:#AA0000; }</style></p><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;"><br></div> <div id="bloop_sign_1402713822374628096" class="bloop_sign"><div style="font-family:helvetica,arial;font-size:13px">Arun</div></div> <div style="color:black"><br>From: <span style="color:black">Arunkumar Srinivasan</span> <a href="mailto:aragorn168b@gmail.com">aragorn168b@gmail.com</a><br>Reply: <span style="color:black">Arunkumar Srinivasan</span> <a href="mailto:aragorn168b@gmail.com">aragorn168b@gmail.com</a><br>Date: <span style="color:black">June 14, 2014 at 4:42:31 AM</span><br>To: <span style="color:black">Ron Hylton</span> <a href="mailto:rhylton@verizon.net">rhylton@verizon.net</a>, <span style="color:black">datatable-help@lists.r-forge.r-project.org</span> <a href="mailto:datatable-help@lists.r-forge.r-project.org">datatable-help@lists.r-forge.r-project.org</a><br>Subject: <span style="color:black"> Re: [datatable-help] data.table is asking for help <br></span></div><br> <blockquote type="cite" class="clean_bq"><span><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;"><div></div><div>
<p>A slightly simpler version of the 2nd solution is:</p>
<pre><code>system.time({
ans = test[, .N, by=names(test)]
ans = ans[ans[, .I[.N > 1L], by=id]$V1]
})
# 0.019 0.000 0.019
</code>
</pre>
<p>The answers are identical; you can check this by doing:</p>
<pre><code>ans[, N := NULL]
setkey(ans)
setkey(ut1)
identical(ans, ut1) # [1] TRUE
</code>
</pre>
<div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;">
<br></div>
<div id="bloop_sign_1402713543682700032" class="bloop_sign">
<div style="font-family:helvetica,arial;font-size:13px">Arun</div>
</div>
<div style="color:black"><br>
From: <span style="color:black">Arunkumar Srinivasan</span>
<a href="mailto:aragorn168b@gmail.com">aragorn168b@gmail.com</a><br>
Reply: <span style="color:black">Arunkumar Srinivasan</span>
<a href="mailto:aragorn168b@gmail.com">aragorn168b@gmail.com</a><br>
Date: <span style="color:black">June 14, 2014 at 4:34:15
AM</span><br>
To: <span style="color:black">Ron Hylton</span> <a href="mailto:rhylton@verizon.net">rhylton@verizon.net</a>, <span style="color:black">datatable-help@lists.r-forge.r-project.org</span>
<a href="mailto:datatable-help@lists.r-forge.r-project.org">datatable-help@lists.r-forge.r-project.org</a><br>
Subject: <span style="color:black">Re: [datatable-help]
data.table is asking for help<br></span></div>
<br>
<blockquote type="cite" class="clean_bq">
<div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">
<div>
<p><span>The j-expression is evaluated from within C for each group
(unless it is optimised with GForce, a new initiative in
data.table), and <code>eval(.SD)</code> or
<code>eval(anything(.SD))</code> is costly.</span></p>
<p><span>You can get around it by listing the columns yourself
and using <code>.I</code> instead, as follows:</span></p>
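<pre><span><code># Caveat: unique() on a plain list compares whole columns rather than rows, so its
# first element is still x1 and this condition is equivalent to .N > 1L.
test[test[, .I[length(unique(list(x1,x2,x3))[[1L]]) > 1L], by=id]$V1]
# 0.140 0.001 0.142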
</code>
</span>
</pre>
<p><span>Takes about 0.14 seconds.</span></p>
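<p>A minimal sketch of the <code>.I</code> idiom used above, on made-up data (<code>DT</code> and <code>idx</code> are hypothetical names). Within each group, <code>.I</code> holds the row numbers of that group in the full table, so subsetting it with a per-group condition and collecting <code>$V1</code> gives the rows to keep:</p>
<pre><code>library(data.table)

DT = data.table(id = c("a", "a", "b", "c", "c", "c"), v = 1:6)

# .N > 1L is a single TRUE/FALSE per group: TRUE keeps all of the group's
# row numbers, FALSE keeps none
idx = DT[, .I[.N > 1L], by = id]$V1
idx       # 1 2 4 5 6 (the single row of "b" is dropped)
DT[idx]   # subset the original table once, outside the grouping
</code></pre>
<p>The grouped step only returns integers, which is why it avoids the cost of materialising <code>.SD</code> for every group.</p>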
<hr>
<p><span>An even faster way is:</span></p>
<pre><span><code>system.time({
ans = test[test[, .I[.N > 1], by=id]$V1] # (1)
ans = ans[, .N, by=names(ans)] # (2)
ans = ans[ans[, .I[.N > 1L], by=id]$V1] # (3)
})
# 0.026 0.000 0.027
</code>
</span>
</pre>
<p><span>The idea for the second case is:</span></p>
<p><span>(1) Remove all entries where there’s just 1 row
corresponding to that <code>id</code>.<br>
(2) Aggregate this result by all the columns and get the number
of rows in the column <code>N</code> (we won’t have to use this
column though). This leaves one row per distinct combination of values.<br>
(3) Now aggregate by <code>id</code>. If any <code>id</code> has just
1 row, it means that <code>id</code> had more than 1
row (the filtering in step (1) ensures this), but all of them were the same,
so we don’t need them. So we just filter for those where <code>.N > 1L</code>.</span></p>
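<p><span>The three steps can be traced on a tiny made-up table (the name <code>toy</code> and its values are hypothetical, chosen only so that each step removes something):</span></p>
<pre><code>library(data.table)

# "a": two identical rows, "b" and "d": single rows, "c": two conflicting rows
toy = data.table(id = c("a", "a", "b", "c", "c", "d"),
                 x1 = c(1L, 1L, 2L, 3L, 4L, 5L))

ans = toy[toy[, .I[.N > 1L], by = id]$V1]   # (1) drops "b" and "d"
ans = ans[, .N, by = names(ans)]            # (2) one row per distinct (id, x1), plus count N
ans = ans[ans[, .I[.N > 1L], by = id]$V1]   # (3) drops "a": only one distinct row is left for it
ans[, N := NULL]                            # the count column is not needed
ans                                         # the two conflicting rows of "c"
</code></pre>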
<p><span>HTH</span></p>
<div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;">
<span><br></span></div>
<div id="bloop_sign_1402709866978106112" class="bloop_sign">
<div style="font-family:helvetica,arial;font-size:13px">
<span>Arun</span></div>
</div>
<div style="color:black"><span><br>
From: <span style="color:black">Ron Hylton</span> <a href="mailto:rhylton@verizon.net">rhylton@verizon.net</a><br>
Reply: <span style="color:black">Ron Hylton</span> <a href="mailto:rhylton@verizon.net">rhylton@verizon.net</a><br>
Date: <span style="color:black">June 14, 2014 at 3:30:55
AM</span><br>
To: <span style="color:black">datatable-help@lists.r-forge.r-project.org</span>
<a href="mailto:datatable-help@lists.r-forge.r-project.org">datatable-help@lists.r-forge.r-project.org</a><br>
Subject: <span style="color:black">Re: [datatable-help]
data.table is asking for help<br></span></span></div>
<br>
<blockquote type="cite" class="clean_bq">
<div lang="EN-US" link="blue" vlink="purple" xml:lang="EN-US">
<div>
<div class="WordSection1">
<p class="MsoNormal"><span><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">
The performance is what puzzles me; the results are correct so the
warnings don’t matter, and not all the variations I’ve tried have
warnings. On the real dataset (~800,000 rows) datatable takes
about 1.5 times longer than dataframe + ddply. I expected it
to be substantially faster.</span></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">
</span></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">
From:</span></b> <span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">
Arunkumar Srinivasan [mailto:aragorn168b@gmail.com]<br>
<b>Sent:</b> Friday, June 13, 2014 8:57 PM<br>
<b>To:</b> Ron Hylton;
datatable-help@lists.r-forge.r-project.org<br>
<b>Subject:</b> Re: [datatable-help] data.table is asking for
help</span></p>
</div>
</div>
<p class="MsoNormal"> </p>
<div id="bloop_customfont">
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">
However there’s another aspect. While I’m relatively new to R
my understanding is that a function argument should be modifiable
within the function body without affecting the caller, which
perhaps conflicts with the behavior of .SD.</span></p>
</div>
</div>
</blockquote>
<div>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-family:"Helvetica","sans-serif""><code>data.table</code>
is designed with <em>really large</em> data sets in
mind (> 100 or 200 GB in memory even). Therefore, as a
design feature, it trades away "referential transparency" in order to
manipulate data objects <em>as efficiently as possible</em> in terms of
both <em>speed</em> and <em>memory usage</em> (most of the time they go
hand-in-hand).</span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-family:"Helvetica","sans-serif"">This is
perhaps the biggest design choice one needs to be aware of when
choosing or working with data.tables. It is possible to modify objects by
reference using data.table: all the functions that begin with
"set" (<code>set*</code>) modify objects by reference. The only other one,
outside the <code>set*</code> family, is the <code>:=</code> operator.</span></p>
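<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-family:"Helvetica","sans-serif"">A short sketch of what this means for function arguments, on made-up data (the names <code>add_b</code> and <code>add_b_copy</code> are hypothetical). It assumes the usual advice that an explicit <code>copy()</code> is the way to get an independent object:</span></p>
<pre><code>library(data.table)

DT = data.table(a = 1:3)
add_b = function(x) {
  x[, b := a * 2L]   # := adds the column in place; x is not copied
  invisible(NULL)
}
add_b(DT)
names(DT)            # "a" "b": the caller's table was modified

DT2 = data.table(a = 1:3)
add_b_copy = function(x) {
  x = copy(x)        # explicit copy, so changes stay local to the function
  x[, b := a * 2L]
  x
}
res = add_b_copy(DT2)
names(DT2)           # "a": the caller's table is untouched
names(res)           # "a" "b"
</code></pre>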
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> </p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-family:"Helvetica","sans-serif"">HTH</span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif";color:black">
Arun</span></p>
</div>
</div>
</div>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif";color:black">
<br>
From: Ron Hylton <a href="mailto:rhylton@verizon.net">rhylton@verizon.net</a><br>
Reply: Ron Hylton <a href="mailto:rhylton@verizon.net">rhylton@verizon.net</a><br>
Date: June 14, 2014 at 2:52:04 AM<br>
To: <a href="mailto:datatable-help@lists.r-forge.r-project.org">datatable-help@lists.r-forge.r-project.org</a>
<a href="mailto:datatable-help@lists.r-forge.r-project.org">datatable-help@lists.r-forge.r-project.org</a><br>
Subject: Re: [datatable-help] data.table is asking for
help</span></p>
</div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">
<br>
<br></span></p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">
I suspected it was something like this. As one clarification,
there is a setkey(test,id) before any setkey(.SD). If
setkey(test,id) is changed to setkey(test) so all columns are in
the original datatable key then the warning goes away.</span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">
</span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">
However there’s another aspect. While I’m relatively new to R
my understanding is that a function argument should be modifiable
within the function body without affecting the caller, which
perhaps conflicts with the behavior of .SD.</span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">
</span></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">
<b><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">
From:</span></b> <span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">
Arunkumar Srinivasan [<a href="mailto:aragorn168b@gmail.com">mailto:aragorn168b@gmail.com</a>]<br>
<b>Sent:</b> Friday, June 13, 2014 8:23 PM<br>
<b>To:</b> Ron Hylton; <a href="mailto:datatable-help@lists.r-forge.r-project.org">datatable-help@lists.r-forge.r-project.org</a><br>
<b>Subject:</b> Re: [datatable-help] data.table is asking for
help</span></p>
</div>
</div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> </p>
<p><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">
Nicely reproducible post. Reproducible in v1.9.3 (latest commit) as
well.</span></p>
<p><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">
This is a tricky one. It happens because you’re setting key
on</span> <code><span style="font-size:10.0pt">.SD</span></code>
<span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">
which should normally not be allowed. What happens is, when you set
key the first time, there’s no key set (here) and therefore key is
set on all the columns</span> <code><span style="font-size:10.0pt">x1</span></code><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">,</span>
<code><span style="font-size:10.0pt">x2</span></code> <span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">
and</span> <code><span style="font-size:10.0pt">x3</span></code><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">.</span></p>
<p><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">
Now, when the next group (in the</span> <code><span style="font-size:10.0pt">by=</span></code><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">)
is passed to your function, it will have the</span>
<code><span style="font-size:10.0pt">key</span></code> <span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">
already set to</span> <code><span style="font-size:10.0pt">x1,x2,x3</span></code> <span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">
(because</span> <code><span style="font-size:10.0pt">setkey</span></code> <span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">
modifies the object by reference), but</span> <code><span style="font-size:10.0pt">.SD</span></code> <span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">
has obtained <strong><span style="font-family:"Helvetica","sans-serif"">new</span></strong>
data corresponding to <em><span style="font-family:"Helvetica","sans-serif"">this</span></em>
group. When</span> <code><span style="font-size:10.0pt">data.table</span></code> <span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">
sorts this data, it sees that the key is already set, in which case
the row order should already be 1:n. It is not, because this group's
data isn’t sorted.</span> <code><span style="font-size:10.0pt">data.table</span></code> <span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">
warns in those scenarios, and that’s why you get the
warning.</span></p>
<p><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">
To verify this, you can try:</span></p>
<div style="border:solid #CCCCCC 1.0pt;padding:3.0pt 6.0pt 3.0pt 6.0pt">
<pre style="background:#F8F8F8"><code>conflictsTable1 <- function(f) {
  u <- unique(setkey(f))      # setkey() keys f (i.e. .SD) by reference
  setattr(f, 'sorted', NULL)  # drop the key again so the next group's .SD starts unkeyed
  if (nrow(u) == 1) return(NULL)
  u
}</code>
</pre></div>
<p><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">
Basically, we reset the key of</span> <code><span style="font-size:10.0pt">f</span></code> <span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">
(which is the same object as</span> <code><span style="font-size:10.0pt">.SD</span></code><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">,
as it’s only modified by reference) to</span> <code><span style="font-size:10.0pt">NULL</span></code> <span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">
after every call, so that</span> <code><span style="font-size:10.0pt">.SD</span></code> <span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">
for the next group will not have the key set.</span></p>
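<p><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">
The same mechanism can be seen outside of grouping. This is only a sketch on made-up data (<code>DT</code> and <code>alias</code> are hypothetical names): a second name for the same table stands in for the function argument that points at</span> <code><span style="font-size:10.0pt">.SD</span></code><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">.</span></p>
<pre style="background:#F8F8F8"><code>library(data.table)

DT = data.table(x1 = c(3L, 1L, 2L), x2 = c("c", "a", "b"))
alias = DT                       # plain assignment does not copy a data.table

setkey(alias)                    # no columns given: key on all columns, sorted in place
key(DT)                          # "x1" "x2": the key shows up on DT as well
DT$x1                            # 1 2 3: DT's rows were reordered too

setattr(alias, "sorted", NULL)   # remove the key, again by reference
key(DT)                          # NULL
</code></pre>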
<p><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">
The ideal scenario here, IIUC, is that</span> <code><span style="font-size:10.0pt">setkey(.SD)</span></code> <span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">
(or the same on anything pointing to</span> <code><span style="font-size:10.0pt">.SD</span></code><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">)
should not be possible (locking the binding doesn’t seem to affect
things done by reference).</span> <code><span style="font-size:10.0pt">.SD</span></code> <span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">
should, however, retain the key of the data.table, if a key was set,
wherever possible.</span></p>
<div id="bloop_customfont">
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">
</span></p>
</div>
<div id="bloop_sign_1402704505278157056">
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">
Arun</span></p>
</div>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="color:black"><br>
From: Ron Hylton <a href="mailto:rhylton@verizon.net">rhylton@verizon.net</a><br>
Reply: Ron Hylton <a href="mailto:rhylton@verizon.net">rhylton@verizon.net</a><br>
Date: June 14, 2014 at 1:55:53 AM<br>
To: <a href="mailto:datatable-help@lists.r-forge.r-project.org">datatable-help@lists.r-forge.r-project.org</a>
<a href="mailto:datatable-help@lists.r-forge.r-project.org">datatable-help@lists.r-forge.r-project.org</a><br>
Subject: [datatable-help] data.table is asking for
help</span></p>
</div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:12.0pt"> </p>
<blockquote style="margin-left:0in;margin-top:11.25pt;margin-right:0in;margin-bottom:11.25pt">
<div>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">The code below
generates the warning:</p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> </p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;word-break:break-all">
<span style="font-size:10.0pt;font-family:"Lucida Console";color:black;background:#E1E2E5">
In setkeyv(x, cols, verbose = verbose) :</span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;word-break:break-all">
<span style="font-size:10.0pt;font-family:"Lucida Console";color:black;background:#E1E2E5">
Already keyed by this key but had invalid row order, key
rebuilt. If you didn't go under the hood please let datatable-help
know so the root cause can be fixed.</span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;word-break:break-all">
<span style="font-size:10.0pt;font-family:"Lucida Console";color:black;background:#E1E2E5">
</span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">This is my
first attempt at using datatable so I probably did something dumb,
but maybe that’s useful for someone. The first case is the
one that gives the warnings.</p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> </p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">I’m also
surprised at the timings. I wrote the original algorithm
using dataframe & ddply and I expected datatable to be
substantially faster; the opposite is true.</p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> </p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">The algorithm
does the following: Certain columns in the table are keys and
others are values, in the sense that each row with the same set of
keys should have the same set of values. Find all the key
sets for which this is not true and return the key sets +
conflicting value sets.</p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> </p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Insight into
the performance would be appreciated.</p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> </p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Regards,</p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Ron</p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> </p>
<pre><code>library(data.table)
library(plyr)

conflictsTable1 <- function(f) {
  u <- unique(setkey(f))
  if (nrow(u) == 1) return(NULL)
  u
}

conflictsTable2 <- function(f) {
  u <- unique(f)
  if (nrow(u) == 1) return(NULL)
  u
}

conflictsFrame <- function(f) {
  u <- unique(f)
  if (nrow(u) == 1) return(NULL)
  u
}

N <- 10000
test <- data.table(id=as.character(10000*sample(1:N,N,replace=TRUE)), x1=rnorm(N), x2=rnorm(N), x3=rnorm(N))

setkey(test,id)

print(system.time(ut1 <- test[, conflictsTable1(.SD), by=id]))

print(system.time(ut2 <- test[, conflictsTable2(.SD), by=id]))

print(system.time(uf <- ddply(test, .(id), conflictsFrame)))
</code></pre>
</div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">
_______________________________________________<br>
datatable-help mailing list<br>
<a href="mailto:datatable-help@lists.r-forge.r-project.org">datatable-help@lists.r-forge.r-project.org</a><br>
<a href="https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help">
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help</a></p>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div></div></span></blockquote><p></p></body></html>