[datatable-help] unkey when I use rbind and/or warn when I try a broken key
Frank Erickson
FErickson at psu.edu
Sun Oct 13 05:20:22 CEST 2013
So, I recently did something like this:
DT <- data.table(name=c('Guff','Aw'),id=101:102,id2=1:2,key='id')
y <- rbind(list('No','NON',0L),DT,list('Extra','XTR',3L))
x <- data.table(id=as.character(101:102),z=1:2,key='id')
Those rows I added on do not belong in the positions I pasted them into, so
when I tried...
options(datatable.verbose=TRUE)
x[y,newcol:=name]
...it failed, silently.
I'm guessing it saw the invalid key column in y and then proceeded to merge
by y's column order instead. Because "name" comes before "id" (the column I
thought was my key), no matches are found and newcol is not created. This
is very, very confusing to see. Even with verbose on, I see no mention of
"assigned to zero rows of x" or "matched on zero groups in y".
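A minimal sketch of how a stale key like this could be detected before joining. `key_is_valid` is a hypothetical helper, not part of data.table, and the stale key is simulated by hand with `setattr` so the example does not depend on how a particular data.table version treats keys in `rbind`. The `id` column is assumed to be character here.

```r
library(data.table)

# Hypothetical helper, not part of data.table: TRUE when the table has no
# "sorted" attribute, or when its rows really are ordered by those columns.
# NA values in key columns are not handled in this sketch.
key_is_valid <- function(dt) {
  k <- attr(dt, "sorted")
  if (is.null(k)) return(TRUE)
  cols <- unname(as.list(dt)[k])
  # method = "radix" orders character data in the C locale, as data.table does
  ord <- do.call(order, c(cols, list(method = "radix")))
  identical(ord, seq_len(nrow(dt)))
}

y <- data.table(name = c("No", "Guff", "Aw", "Extra"),
                id   = c("NON", "101", "102", "XTR"),
                id2  = c(0L, 1L, 2L, 3L))
setattr(y, "sorted", "id")      # simulate the stale key: rows are not sorted by id
before <- key_is_valid(y)       # the claimed key does not match the row order

setkey(y, NULL)                 # drop the stale key
setkey(y, id)                   # re-sort and set the key explicitly
after <- key_is_valid(y)
```

Running a check like this on `y` before `x[y, ...]` would flag the problem up front, which is roughly the warning asked for in (3a) below.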
I've got several problems with how this worked:
(1) y should not inherit DT's key when I rbind it, or I should get a
warning when rbinding a keyed data.table suggesting a better approach (that
I clearly do not know about yet...?).
(2) I really don't like the silent failure to assign to or create newcol.
Warnings are nice.
(3) It failed because DT1 (y in my example) had an invalid key (i.e., a
"sorted" attribute on columns by which it is not actually sorted). When I
merge DT2[DT1] (x[y] above) and it is found that DT1's key is invalid, I'd
like to see (3a) a warning and (3b) it tell me explicitly that it's merging
on column order instead.
Note that there's a nice warning message when I reset the key:
setkey(y,id)
# Warning message:
# In setkeyv(x, cols, verbose = verbose) :
#   Already keyed by this key but had invalid row order, key rebuilt. If
#   you didn't go under the hood please let datatable-help know so the root
#   cause can be fixed.
What do you all think? Also, is there a right or safe way to do rbinding?
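One possible pattern, sketched under assumptions rather than offered as the recommended answer: bind with `rbindlist`, set the key explicitly afterwards, and name the join column with `on=`. This assumes a later data.table release that supports the `on=` argument, and it makes `id` character in both tables so the join columns have the same type. The table `extra` is a name introduced only for this illustration.

```r
library(data.table)

# Assumption: id is stored as character in both tables.
DT <- data.table(name = c("Guff", "Aw"), id = c("101", "102"),
                 id2 = 1:2, key = "id")
extra <- data.table(name = c("No", "Extra"), id = c("NON", "XTR"),
                    id2 = c(0L, 3L))

# Bind the rows, then set the key explicitly so the "sorted" attribute
# always reflects the actual row order.
y <- rbindlist(list(extra[1], DT, extra[2]))
setkey(y, id)

x <- data.table(id = as.character(101:102), z = 1:2, key = "id")

# Name the join column explicitly instead of relying on keys or column order;
# i.name refers to the name column of y.
x[y, on = "id", newcol := i.name]
print(x)
```

Naming the join column means the result does not depend on whether `y` carries a key at all, so a stale "sorted" attribute cannot silently change which columns are matched.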
Thanks,
Frank