<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN">
<html><body>
<p>Agreed, this is a new bug. Thanks for reporting it. Could you please file it on the R-Forge tracker? That way you'll get automatic updates. Otherwise I can file it myself; I don't mind either way.</p>
<p>I will get to the bug list eventually!</p>
<p>Thanks, Matthew</p>
<p> </p>
<p>On 10.04.2013 15:46, Shir Levkowitz wrote:</p>
<blockquote type="cite" style="padding-left:5px; border-left:#1010ff 2px solid; margin-left:5px; width:100%">I have encountered a bug in the Cartesian join of two data.tables, where the resulting data.table is not sorted by its full key. This is in data.table v1.8.8. Please let me know if this issue has been brought up or if there is any insight regarding it.
<div>Thank you,</div>
<div>Shir Levkowitz<br />
<div>-------------------------------------------------</div>
<div>
<div><span style="font-family: Courier;">library(data.table)</span></div>
<div><span style="font-family: Courier;"><br /></span></div>
<div><span style="font-family: Courier;">###### set up our example data tables</span></div>
<div><span style="font-family: Courier;">test1 </span></div>
<div><span style="font-family: Courier;"> b=sample(1:3, 100, replace=TRUE),</span></div>
<div><span style="font-family: Courier;"> c=sample(1:10, 100,replace=TRUE))</span></div>
<div><span style="font-family: Courier;">setkey(test1, a,b,c)</span></div>
<div><span style="font-family: Courier;"><br /></span></div>
<div><span style="font-family: Courier;">test2 </span></div>
<div><span style="font-family: Courier;"> q=sample(1:3, 100, replace=TRUE),</span></div>
<div><span style="font-family: Courier;"> r=sample(1:100),</span></div>
<div><span style="font-family: Courier;"> w=sample(1:100))</span></div>
<div><span style="font-family: Courier;">setkey(test2, p,q)</span></div>
</div>
<div><span style="font-family: Courier;"><br /></span></div>
<div><span style="font-family: Courier;"><br /></span></div>
<div>
<div><span style="font-family: Courier;">###### a cartesian join - this is where the issue arises</span></div>
<div><span style="font-family: Courier;">test.join </span></div>
</div>
<div><span style="font-family: Courier;"><br /></span></div>
<div><span style="font-family: Courier;">### have a look at the key</span></div>
<div><span style="font-family: Courier;">k &lt;- key(test.join)</span></div>
<div><span style="font-family: Courier;">k</span></div>
<div><span style="font-family: Courier;"><br /></span></div>
<div><span style="font-family: Courier;">### if we do a group by, we don't get the right aggregation</span></div>
<div><span style="font-family: Courier;">test.gb </span></div>
<div><span style="font-family: Courier;">test.gb[a == 1 &amp; b == 1 &amp; c == 1,]</span></div>
<div><span style="font-family: Courier;">### when really what we want is:</span></div>
<div><span style="font-family: Courier;">test.agg </span></div>
<div><span style="font-family: Courier;">subset(test.agg, a == 1 &amp; b == 1 &amp; c == 1)</span></div>
<div><span style="font-family: Courier;"><br /></span></div>
<div><span style="font-family: Courier;">### if we set the same key, we get a warning</span></div>
<div><span style="font-family: Courier;">setkeyv(test.join, k)</span></div>
<div><span style="font-family: Courier;">>> Warning message: </span></div>
<div><span style="font-family: Courier;">In setkeyv(test.join, k) : </span><span style="font-family: Courier;">Already keyed by this key but had invalid row order, key rebuilt. If you didn't go under the hood please let datatable-help know so the root cause can be fixed.</span></div>
<div><span style="font-family: Courier;"><br /></span></div>
<div></div>
</div>
</blockquote>
</body></html>