[datatable-help] Return Select/Join that does NOT match?

Branson Owen branson.owen at gmail.com
Wed Jul 28 05:32:26 CEST 2010


To everyone who contributed to data.table,

I think data.table is the greatest package I have used. It not only
provides convenience, but has also changed the way I program and think!
I can't wait to see a new version with even more power! I found that
many features we wished for have already been implemented. I want to
say a million thanks to this community before I ask my questions.

Here are my questions about join/select:

[1] Is it possible to return the rows of a select/join that do NOT
match? This is easy with a logical index like !(x == 1), but then we
are back to a vector scan and lose the benefit of binary search. I am
not sure about the syntax. Maybe something like the "invert = TRUE"
argument of grep?

DataTable[ CJ("exclude") , invert = TRUE]

At the moment, I wonder whether

DataTable[ CJ( unique(column) %NOT IN% "exclude" ) ]

is faster than the scan

DataTable[ column != "exclude" ]

(%NOT IN% is a custom function that returns the items of its left
argument that are not in its right argument.)
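
To make the comparison concrete, here is a minimal sketch of the two alternatives. The table, its column names, and the %NOT IN% helper are all made up for illustration, and I am assuming a table keyed on a single character column; this is not meant as the recommended way to do it.

```r
library(data.table)

# Hypothetical helper: the elements of x that are not in y.
`%NOT IN%` <- function(x, y) x[!(x %in% y)]

# Hypothetical table keyed on a single column named `column`.
DT <- data.table(column = c("a", "b", "exclude", "c", "exclude"),
                 value  = 1:5)
setkey(DT, column)

# Alternative A: logical index, which scans the whole column.
scan_result <- DT[column != "exclude"]

# Alternative B: join on every key value except the excluded one.
keep        <- unique(DT$column) %NOT IN% "exclude"
join_result <- DT[CJ(keep)]
```

Both results should contain the same rows. The open question is which one is faster on a large table, since alternative B still has to call unique() on the whole column before it can join.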


[2] Assume I have a DataTable with four keys. How can I efficiently
select/join on the last two keys while skipping the first two?

This is what I am doing now:

DataTable[ CJ( unique(key1), unique(key2), "target key3", "a
collection of target key4") ]

Am I not supposed to use a join like this? Could CJ(...) create an
object comparable in size to the original datatable? The original
datatable might already be near the memory limit. Should I just use a
scan in this case (I hope not)?
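
To show the size I am worried about, here is a minimal sketch. The table, the column names, and the target values are hypothetical, and the real table is of course much larger. The number of rows CJ builds is the product of the lengths of its arguments, so it grows with the number of distinct key1 and key2 values.

```r
library(data.table)

# Hypothetical table with four key columns and 24 rows.
DT <- data.table(
  key1  = rep(1:3, each = 8),
  key2  = rep(rep(1:2, each = 4), times = 3),
  key3  = rep(rep(c("a", "b"), each = 2), times = 6),
  key4  = rep(c("x", "y"), times = 12),
  value = 1:24
)
setkey(DT, key1, key2, key3, key4)

targets4 <- c("x", "y")  # hypothetical collection of target key4 values

# Rows CJ() will generate: product of the lengths of its arguments.
cj_rows <- length(unique(DT$key1)) * length(unique(DT$key2)) *
  1L * length(targets4)

# The lookup table itself, then the join; nomatch = 0L drops
# combinations that do not occur in DT.
lookup <- CJ(key1 = unique(DT$key1), key2 = unique(DT$key2),
             key3 = "a", key4 = targets4)
result <- DT[lookup, nomatch = 0L]

# Compare the memory used by the lookup table and the original table.
print(object.size(lookup))
print(object.size(DT))
```

In this toy case the lookup table has half as many rows as the original, which is the kind of overhead I would like to avoid when memory is already tight.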

[3] I thought I could do this:

DataTable[ CJ( FN(key1), FN(key2), FN(key3) ) ]

(FN is a function), but it complains about column names.

Later I found that this works:

DataTable[ { CJ( FN(key1), FN(key2), FN(key3) ) } ]

I just added { } around CJ.

I don't understand why, but at least it works. Should I do it this
way, or is there a more correct syntax?


Again, thank you very, very much for all your efforts. Your work is
fantastic and impactful! I seriously believe data.table should become
a standard data type to replace data.frame!

Best regards,

