[datatable-help] rbindlist and unique

Gabor Grothendieck ggrothendieck at gmail.com
Wed May 21 02:50:54 CEST 2014


On Tue, May 20, 2014 at 8:45 PM, Nathaniel Graham <npgraham1 at gmail.com> wrote:
> Thanks!  That's a good idea, and a lot simpler than what I was concocting in
> my head.  I'll give that a try.  I think--just for posterity--you mean
>
> DT[, importance := 0 - is.na(V3)]
>
> rather than 0 + is.na(V3), so that rows with V3 are lower than rows without.

0 + is.na(V3) was intended.  We want the good rows to have a lower
importance than the bad rows: 0 + is.na(V3) gives a row whose V3 is
not NA an importance of 0 and a row whose V3 is NA an importance of 1.
setkey sorts in ascending order, so within each (V1, V2) group the
non-NA row (importance 0) comes first, and that first row is the one
kept by unique.

> DT[, importance := 0+is.na(V3)]
> setkey(DT, V1, V2, importance)
> unique(DT, by = c("V1", "V2"))
   V1 V2   V3 importance
1:  1  3 TRUE          0
2:  1  4 TRUE          0
3:  2  3 TRUE          0
4:  2  4 TRUE          0
5:  3  1   NA          1
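
For anyone who wants to try this without the original data, here is a
minimal self-contained sketch.  The three-row table is made up purely for
illustration (it is not the data from this thread): the pair (1, 3) occurs
twice, once with V3 = NA and once with V3 = TRUE, while (3, 1) occurs only
with NA.

```r
library(data.table)

# Hypothetical data, invented for this example only.
DT <- data.table(V1 = c(1, 1, 3),
                 V2 = c(3, 3, 1),
                 V3 = c(NA, TRUE, NA))

# 0 for rows where V3 is present, 1 for rows where V3 is NA.
DT[, importance := 0 + is.na(V3)]

# Sort ascending by V1, V2, importance, so that within each (V1, V2)
# group the non-NA row (importance 0) comes first.
setkey(DT, V1, V2, importance)

# unique keeps the first row of each (V1, V2) group.
res <- unique(DT, by = c("V1", "V2"))
print(res)
```

Here the (1, 3) group should keep its V3 = TRUE row, and (3, 1) keeps its
NA row because it has nothing better.  With 0 - is.na(V3) the NA rows
would get -1, sort first, and be the ones kept instead.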

