[datatable-help] Cartesian join invalid key order - bug report
Matthew Dowle
mdowle at mdowle.plus.com
Wed Apr 10 17:06:55 CEST 2013
Agreed, this is a new bug. Thanks for reporting it. Could you please file it
on the R-Forge tracker (then you'll get automatic updates)? Or I can file it,
I don't mind.
I will get to the bug list eventually!
Thanks, Matthew
On 10.04.2013 15:46, Shir Levkowitz wrote:
> I have encountered a bug in the Cartesian join of two data.tables, where the
> resulting data.table is not sorted by its full key. This is in data.table
> v1.8.8. Please let me know if this issue has been brought up or if there is
> any insight regarding it.
> Thank you,
> Shir Levkowitz
>
>
> -------------------------------------------------
>
> library(data.table)
>
> ###### set up our example data tables
> test1
> b=sample(1:3, 100, replace=TRUE),
> c=sample(1:10, 100, replace=TRUE))
> setkey(test1, a,b,c)
>
> test2
> q=sample(1:3, 100, replace=TRUE),
> r=sample(1:100),
> w=sample(1:100))
>
> setkey(test2, p,q)
>
> ###### a cartesian join - this is where the issue arises
> test.join
>
> ### have a look at the key
> k
> k
>
> ### if we do a group by, we don't get the right aggregation
>
> test.gb
> test.gb[a == 1 & b == 1 & c == 1,]
> ### when really what we want is:
> test.agg
> subset(test.agg, a == 1 & b == 1 & c == 1)
>
>
> ### if we set the same key, we get a warning
> setkeyv(test.join, k)
>>> Warning message:
> In setkeyv(test.join, k) : Already keyed by this key but had invalid row
> order, key rebuilt. If you didn't go under the hood please let
> datatable-help know so the root cause can be fixed.
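
Until the root cause is fixed, one way to guard against a key that is declared but not honoured is to check the row order yourself and rebuild the key if needed. The following is only a rough sketch under stated assumptions: `is_sorted_by` is a hypothetical helper (not part of data.table), and `x`, `y` and `j` are small made-up tables, not the `test1`/`test2` data from the report.

```r
library(data.table)

# Hypothetical helper (not a data.table function): TRUE if the rows of `dt`
# are already in non-decreasing order of the columns named in `cols`.
is_sorted_by <- function(dt, cols) {
  if (length(cols) == 0L || nrow(dt) == 0L) return(TRUE)
  # order() is stable, so sorted input yields exactly 1..n
  ord <- do.call(order, unname(as.list(dt)[cols]))
  identical(ord, seq_len(nrow(dt)))
}

# Small made-up tables; each row of y matches several rows of x.
x <- data.table(a = c(1L, 1L, 2L, 2L), b = c(1L, 2L, 1L, 2L), v = 1:4,
                key = c("a", "b"))
y <- data.table(a = c(1L, 1L, 2L), w = c(10L, 20L, 30L), key = "a")

j <- x[y, allow.cartesian = TRUE]

# If the result claims a key but its rows are not in that order,
# drop the key and set it again so the table is physically re-sorted.
k <- key(j)
if (!is.null(k) && !is_sorted_by(j, k)) {
  setkey(j, NULL)
  setkeyv(j, k)
}
```

Clearing the key before calling `setkeyv` is a deliberate choice here: it forces a full re-sort instead of relying on the "already keyed" check that produced the warning above.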