[datatable-help] Merge bug in v 1.7.8 and 1.7.9?

DM tb2usd at gmail.com
Tue Jan 31 15:53:00 CET 2012


Good morning,

I found a discrepancy between the results of two scripts that were run on
the same data with versions 1.7.6 and 1.7.8 (and now 1.7.9) of data.table.
The root cause appears to be a change in the behavior of `merge.data.table`.

I have two objects dtA and dtB, with columns (i, j, k, A) and (j, k, B),
respectively.  I merge these via:

dtC = merge(dtA, dtB, by = c("k", "j"), all.x = TRUE)

NB: In dtA, many rows share the same values of (j, k) (i.e. these are not
unique per row), while each (j, k) pair is unique in dtB.  In addition,
neither dtA nor dtB has a key assigned, though it seems v1.7.9 sets the key
(k, j) on dtC (I haven't yet checked v1.7.6, but it doesn't bother me).

In v1.7.6, dtC rows that share the same (j, k) values also share the same B
value, which is the expected behavior.  In v1.7.8 and v1.7.9, this is no
longer the case, and I am not sure where the B values come from.
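
For illustration only, here is a minimal sketch of how the expected behavior
could be checked.  The toy values below are hypothetical and are not the data
from the scripts described above; only the column layout and the merge call
follow the description.  Whether this small case triggers the problem in
v1.7.8 or v1.7.9 is not known.

```r
library(data.table)

# Hypothetical toy data: (j, k) repeats in dtA and is unique in dtB.
dtA <- data.table(i = 1:6,
                  j = c(1L, 1L, 2L, 2L, 3L, 3L),
                  k = c("x", "x", "y", "y", "z", "z"),
                  A = c(10, 11, 12, 13, 14, 15))
dtB <- data.table(j = c(1L, 2L),
                  k = c("x", "y"),
                  B = c(100, 200))

# Same call as in the post: keep every row of dtA, attach B where (k, j) matches.
dtC <- merge(dtA, dtB, by = c("k", "j"), all.x = TRUE)

# Count the distinct B values within each (j, k) group of the result.
# Unmatched groups hold only NA, which counts as a single distinct value.
chk <- dtC[, list(nB = length(unique(B))), by = list(j, k)]

# Groups with more than one distinct B value would indicate the discrepancy.
bad <- chk[nB > 1]
print(bad)
```

If the merge behaves as expected, `bad` has no rows and dtC has as many rows
as dtA, since (j, k) is unique in dtB.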

I will attempt to generate a reproducible example, and follow-up.