[datatable-help] join example from faq

Matthew Dowle mdowle at mdowle.plus.com
Thu Jun 7 17:19:16 CEST 2012


> Hi Matthew,
>
> Thanks for your response.
>
> I'm following things right up to:
>
>>  "Advanced: In the X[Y,j] form of grouping, the j expression sees
>> variables in X first, then Y. We call this join inherited scope. If the
>> variable is not in X or Y then the calling frame is searched, its
>> calling frame, and so on in the usual way up to and including the global
>> environment."
>
> I'm mainly confused about
>
> X[Y]
>
> In the example Y does not have a key. Y's column names are also
> different.  So how did data.table
> know to use column "V1" to match with X's key?

   "An equi-join is performed between each column in i to each column in
x's key".

In other words, when i has no key, the first column of i is matched to the
first column of x's key, the 2nd column of i to the 2nd column of x's key,
and so on, by position rather than by column name.
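
A minimal sketch of that positional matching, using made-up tables (the
column names a, foo, V1 and bar and the values are hypothetical, chosen
only for illustration):

```r
library(data.table)

# X is keyed on column "a"; Y has no key and uses different column names.
X <- data.table(a = c("x", "y", "z"), foo = 1:3, key = "a")
Y <- data.table(V1 = c("z", "x"), bar = c(10, 20))

# Y has no key, so Y's first column (V1) is matched to the first
# column of X's key (a), whatever the columns happen to be called.
res <- X[Y]
print(res)
```

The result has one row per row of Y, in Y's order, with foo looked up from
the matching row of X.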

> Also, if the sentence
> above applies, how does X[Y,j] relate to this example
> in which I have used X[Y]?

Sorry, probably too much too soon. I saw your example where X had foo and
Y had bar, and I remembered the X[Y,foo*bar] example, which combines join
inherited scope (JIS) with grouping by i and is particularly efficient. It
perhaps isn't related to your question. X[Y] is a bit like merge() in that
it returns all the columns from X and Y, whether or not you actually need
them in later steps. There's a FAQ that encourages X[Y,j] instead, since
that inspects j to see which columns are used and only subsets/groups
those columns, plus any join inherited columns (i.e. columns in Y, not X,
that j uses).
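
A small sketch of the difference, again with hypothetical tables and
column names (id, foo, other, V1, bar). One thing to flag: in recent
data.table releases (1.9.4 onwards) by = .EACHI has to be spelled out to
evaluate j once per row of Y; older releases did that grouping implicitly
in X[Y,j].

```r
library(data.table)

X <- data.table(id = c("a", "a", "b", "c"),
                foo = c(1, 2, 3, 4),
                other = 101:104,
                key = "id")
Y <- data.table(V1 = c("a", "b"), bar = c(10, 100))

# X[Y] is merge()-like: every column of X and Y comes back,
# including `other`, whether or not it is needed later.
all_cols <- X[Y]

# X[Y, j]: j sees foo (from X) first, then bar (from Y) via join
# inherited scope, and is evaluated once for each row of Y.
res <- X[Y, .(total = sum(foo * bar)), by = .EACHI]
print(res)
```

Here j only mentions foo and bar, so the unused column `other` never
needs to be carried into the result.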

> And yes, without data.table, I would have been in trouble for my
> current project. I had
> to repeat the aggregation 20 or so times. It would have been ugly.
>
> Thanks,
>
> Juliet
>



