[datatable-help] merging syntax

Eduard Antonyan eduard.antonyan at gmail.com
Tue Jun 18 17:34:19 CEST 2013


Frank has already answered why you're getting the results you are.

Re "how do I tell DT to do *separate matchings at each join*?":

Currently you have to use 'merge' or 'data.table', see this -
https://r-forge.r-project.org/tracker/?func=detail&atid=978&aid=4675&group_id=240
and
the SO link inside.


On Tue, Jun 18, 2013 at 10:08 AM, Ivan Alves <papucho at me.com> wrote:

> Hi Frank,
>
> Many thanks for the thoughts.
>
> It is something that has to do with the keys, that is for sure (see
> below). A double DT join does not understand that it has to do one join by
> one variable and the other by another variable: the output simply has lines
> where the two keys are the same (ctpty_head==iss_head),which is of course
> not optimal.
>
> how do I tell DT to do *separate matchings at each join*?  Setting setkey(
> links,ctpty_head,iss_head) before the join does not work either.
>
> > key(g_ctpty[links])
> NULL
> > key(g_ctpty)
> [1] "ctpty_head"
> > key(g_iss)
> [1] "iss_head"
> > key(links)
> NULL
>
> Hi Arunkumar,
>
> An example would look like follows:
>
> g_ctpty = data.table(ctpty_head=c("a","b","c"),
> ctpty_cty=c("US","DE","JP"))
> g_iss = data.table(iss_head=c("a","b","c"), iss_cty=c("US","DE","JP"))
> links = data.table(ctpty_head=c("a","b","c"), iss_head=c("b","b","a"))
>
> merge(
> +     merge(links, g_ctpty, all.x = TRUE, by = "ctpty_head"),
> +     g_iss,
> +     all.x = TRUE, by = "iss_head"
> +   )
>    iss_head ctpty_head ctpty_cty iss_cty
> 1:        a          c        JP      US
> 2:        b          a        US      DE
> 3:        b          b        DE      DE
> > g_iss[g_ctpty[links]]
>    iss_head iss_cty ctpty_cty iss_head.1
> 1:        a      US        US          b
> 2:        b      DE        DE          b
> 3:        c      JP        JP          a
> > setkey(links,ctpty_head,iss_head)
> > g_iss[g_ctpty[links]]
>    iss_head iss_cty ctpty_cty iss_head.1
> 1:        a      US        US          b
> 2:        b      DE        DE          b
> 3:        c      JP        JP          a
>
> On 17 Jun 2013, at 14:36, Frank Erickson <FErickson at psu.edu> wrote:
>
> I think that key(g_ctpty[links]) == key(g_ctpty) == "ctpty_head", which is
> used when you do your second merge with [, instead of "iss_head". You can
> check this by running key(g_ctpty[links]) . --Frank
>
>
> On Mon, Jun 17, 2013 at 1:57 AM, Arunkumar Srinivasan <
> aragorn168b at gmail.com> wrote:
>
>>  Since you have the data as well, why not provide it (a small part at
>> least with which your issue is reproducible)? Isn't it much easier than to
>> ask everyone who's willing to help to create a data and test your code?
>>
>> Arun
>>
>> On Monday, June 17, 2013 at 8:54 AM, Ivan Alves wrote:
>>
>> Dear all,
>>
>> I am not sure I understand the syntax for merging data.tables. I have
>> keyed the two 'satelite' tables from which I want to match information to
>> the main table 'links'
>>
>> g_ctpty <- gultimate[,list(ctpty_head,ctpty_cty)]
>> setkey(g_ctpty,ctpty_head)
>> g_iss <- gultimate[,list(iss_head,iss_cty)]
>> setkey(g_iss,iss_head)
>>
>> Why are the two below not equivalent?
>>
>> This works:
>>
>> data = merge(
>> merge(links, g_ctpty, all.x = TRUE, by = "ctpty_head"),
>> g_iss,
>> all.x = TRUE, by = "iss_head"
>> ),
>>
>> And this does not:
>>
>> data = g_iss[g_ctpty[links]],
>>
>> Any guidance would be appreciated.
>> Kind regards,
>> Ivan
>> _______________________________________________
>> datatable-help mailing list
>> datatable-help at lists.r-forge.r-project.org
>>
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
>>
>>
>>
>> _______________________________________________
>> datatable-help mailing list
>> datatable-help at lists.r-forge.r-project.org
>>
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
>>
>
>
>
> _______________________________________________
> datatable-help mailing list
> datatable-help at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/datatable-help/attachments/20130618/e4cadcc8/attachment-0001.html>


More information about the datatable-help mailing list