[datatable-help] Return Select/Join that does NOT match?
Branson Owen
branson.owen at gmail.com
Wed Jul 28 05:32:26 CEST 2010
To everyone who contributed to data.table,
I think data.table is the greatest package I have used. It not only
provides convenience, but has also changed the way I program and think!
I can't wait to see new versions with even more power. I found that
many of the features we wished for have already been implemented. I
want to say a million thanks to this community before I ask my
questions.
Here are questions relevant to join/select:
[1] Is it possible to return the rows that do NOT match a select/join?
This is easy with a logical index like !(x == 1), but then we are back
to a vector scan and lose the benefit of binary search. I am not sure
what the syntax should be. Maybe something like the "invert = TRUE"
argument of grep?
DataTable[ CJ("exclude") , invert = TRUE]
At the moment, I wonder whether
DataTable[ CJ( unique(column) %NOT IN% "exclude" ) ]
(where %NOT IN% is a custom function that returns the unselected
items) is faster than a scan such as
DataTable[ column != "exclude" ]
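To make the comparison in [1] concrete, here is a minimal sketch on a small made-up table. The table, the column names `column` and `value`, and the use of setdiff() in place of the hypothetical %NOT IN% are assumptions for illustration only. The last statement uses the not-join form `!J(...)`, which as far as I know exists only in data.table releases later than this message, so treat it as an assumption about the installed version.

```r
library(data.table)

# Hypothetical example table; `column` and `value` are illustrative names.
DT <- data.table(column = c("a", "exclude", "b", "exclude", "c"),
                 value  = 1:5)
setkey(DT, column)

# Vector scan: every row is compared against "exclude".
by_scan <- DT[column != "exclude"]

# Keyed join on the complement: look up every key value except "exclude".
keep    <- setdiff(unique(DT$column), "exclude")
by_join <- DT[J(keep)]

# Not-join (assumes a data.table version that supports `!` in i).
by_notjoin <- DT[!J("exclude")]
```

All three forms return the same rows here. Whether the join on the complement beats the scan is not measured by this sketch; it likely depends on how many distinct key values have to be looked up.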
[2] Assume I have a DataTable with four key columns. How can I
efficiently select/join on the last two keys while skipping the first
two?
This is what I am doing now:
DataTable[ CJ( unique(key1), unique(key2), "target key3", "a collection of target key4" ) ]
Am I not supposed to use a join like this? Could CJ(...) create a big
object that is comparable in size to the original data table? The
original data table might already be close to the memory limit. Should
I just use a scan in this case (I hope not)?
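As a rough illustration of the size concern in [2]: CJ() builds the full cross product of its arguments, so its row count is the product of their lengths. The sketch below uses a small made-up table with four key columns; the column names, the values, and the `value` column are assumptions for illustration.

```r
library(data.table)

# Hypothetical table with four key columns (3 * 2 * 2 * 3 = 36 rows).
DT <- CJ(key1 = 1:3,
         key2 = c("x", "y"),
         key3 = c("p", "q"),
         key4 = c("m", "n", "o"))
DT[, value := seq_len(nrow(DT))]   # CJ() already keyed DT by key1..key4

# Join table: every key1 and key2 value, one key3 target, two key4 targets.
idx <- CJ(unique(DT$key1), unique(DT$key2), "p", c("m", "o"))
nrow(idx)                          # 3 * 2 * 1 * 2 rows

by_join <- DT[idx]

# The same selection written as a vector scan, for comparison.
by_scan <- DT[key3 == "p" & key4 %in% c("m", "o")]
```

When key1 and key2 have many distinct values, `idx` grows with their product, so it can approach the size of the original table even though only key3 and key4 are being restricted.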
[3] I thought I could do this:
DataTable[ CJ( FN(key1), FN(key2), FN(key3) ) ]
but it complains about column names. (FN is a function.)
Later I found that this works:
DataTable[ { CJ( FN(key1), FN(key2), FN(key3) ) } ]
I just added { } around CJ.
I don't understand why, but at least it works. Should I do it this
way, or is there a more correct syntax?
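One way to avoid depending on how the i argument is evaluated is to build the join table first and pass it in by name. This is only a sketch: the FN below is a hypothetical stand-in for whatever function returns the wanted key values, and the table and its columns are made up for illustration.

```r
library(data.table)

# Hypothetical table with three key columns (2 * 2 * 2 = 8 rows).
DT <- CJ(key1 = 1:2,
         key2 = c("a", "b"),
         key3 = c("u", "v"))
DT[, value := seq_len(nrow(DT))]

# Hypothetical FN: returns the first distinct value of a column.
FN <- function(x) unique(x)[1]

# Build the join table outside of `[`, referring to columns explicitly.
idx <- CJ(FN(DT$key1), FN(DT$key2), FN(DT$key3))

res <- DT[idx]
```

Because `idx` is an ordinary object by the time it reaches `[`, the columns are looked up explicitly through `DT$` rather than by name inside the call.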
Again, thank you very, very much for all your efforts. Your work is
fantastic and impactful! I seriously believe data.table should be a
standard data type to replace data.frame!
Best regards,