[datatable-help] Joins with or-condition

Wolfgang Steitz wsteitz at gmail.com
Thu Jun 6 10:19:56 CEST 2013


Is it possible to do something like that with data.table?
    SELECT sum(a.abc)
    FROM table_a as a
    LEFT JOIN table_b as b
        ON a.x = b.x AND (a.y=b.y or a.y=b.z or a.z=b.y or a.z=b.z)
    Group by a.xyz

So my problem is to translate those or-conditions in the ON into
data.table syntax. I tried to do a cartesion join first and then
filter with the or-conditions and then do the groupby, which gives me
the desired results. But since I have large datasets (table_a and
table_b both have more than 1 million rows), I want to avoid the
cartesion join.

Any ideas how to do this in a more clever way?

Thanks,
Wolfgang


More information about the datatable-help mailing list