[datatable-help] Rolling Join With Groups in One Table

Bernstein, Elliot J EJBernstein at wellington.com
Thu Feb 15 22:32:15 CET 2018


Frank –

Thank you very much. Those notes are extremely helpful.

- Elliot

From: Frank Erickson [mailto:by.hook.or at gmail.com]
Sent: Wednesday, February 14, 2018 4:24 PM
To: Bernstein, Elliot J
Cc: datatable-help at lists.r-forge.r-project.org
Subject: Re: [datatable-help] Rolling Join With Groups in One Table

Np. Those are documented in ?data.table and ?.SD, but it's hard to put the pieces together.

This is an "update join". The developers have a vignette on joins planned, but in the meantime, maybe my notes on it can help: http://franknarf1.github.io/r-tutorial/_book/tables.html#dt-joins
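
For readers of the archive, here is a minimal sketch of what "update join" means. The tables `prices` and `labels` and their columns are hypothetical toy data, not from this thread. The first form mirrors the idiom used in this thread; the second is, as far as I know, the more commonly seen spelling of the same operation.

```r
library(data.table)

# Hypothetical toy tables (not from the thread).
prices <- data.table(id = c(1L, 2L, 3L), price = c(10, 20, 30))
labels <- data.table(id = c(2L, 3L, 4L), label = c("two", "three", "four"))

# Form used in this thread: for every row of prices (.SD), look up the
# matching row of labels and return labels' own column via the x. prefix.
prices[, label := labels[.SD, on = .(id), x.label]]

# Alternative spelling: join labels in i and copy its column via the i. prefix.
prices[labels, on = .(id), label2 := i.label]

print(prices)
```

Both forms add a column to `prices` by reference; rows of `prices` with no match in `labels` get NA.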

On Wed, Feb 14, 2018 at 4:01 PM, Bernstein, Elliot J <EJBernstein at wellington.com> wrote:
Frank –

Thank you very much for your help.

That works, but unfortunately I don’t understand why. I’ve been searching the package help and vignettes, but haven’t found an explanation. Is this syntax documented somewhere? (Nearly everything about that line is a mystery to me: What does it mean to use .SD in the i argument to DT[]? What is “on = .(date)” doing? And how is “x.y” apparently referring to column “y” of y.dt?)

Thanks.

- Elliot

From: Frank Erickson [mailto:by.hook.or at gmail.com]
Sent: Wednesday, February 14, 2018 3:27 PM
To: Bernstein, Elliot J
Cc: datatable-help at lists.r-forge.r-project.org
Subject: Re: [datatable-help] Rolling Join With Groups in One Table

You mean

x.dt[, y := y.dt[.SD, on=.(date), roll=TRUE, x.y]]

?
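
A sketch that splits that one-liner into two steps may make it easier to follow. It rebuilds x.dt and y.dt as in the original question (the random x column is omitted because it plays no part in the join); the intermediate name `looked.up` is made up for illustration.

```r
library(data.table)

x.dt <- as.data.table(
  expand.grid(date = seq(as.Date("2017-01-01"), as.Date("2017-12-31"), "days"),
              group = c("A", "B"))
)
y.dt <- data.table(date = seq(as.Date("2017-01-01"), as.Date("2017-12-31"), "months"))
y.dt[, y := month(date)]

# Step 1: for each row of x.dt, find the y.dt row with the same date or,
# because roll = TRUE, the most recent earlier date. on = .(date) names the
# only column used for matching, so group is ignored. In j, x.y means
# "column y of the table outside the bracket", which here is y.dt.
looked.up <- y.dt[x.dt, on = .(date), roll = TRUE, x.y]

# Step 2: store the looked-up vector as a new column of x.dt by reference.
x.dt[, y := looked.up]
```

In the one-liner, `.SD` inside `x.dt[, ...]` with no `by` stands for x.dt itself, so the inner join there is the same as step 1.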

On Wed, Feb 14, 2018 at 3:18 PM, Bernstein, Elliot J <EJBernstein at wellington.com> wrote:
How do I execute a rolling join when one table has groups, and the other does not? For example:

library(data.table)

x.dt <- as.data.table(
  expand.grid(date = seq(as.Date("2017-01-01"), as.Date("2017-12-31"), "days"),
              group = c("A", "B")
  )
)
x.dt[, x := rnorm(.N)]
setkey(x.dt, group, date)

y.dt <- data.table(date = seq(as.Date("2017-01-01"), as.Date("2017-12-31"), "months"))
y.dt[, y := month(date)]
setkey(y.dt, date)

result <- y.dt[x.dt, roll = TRUE]

The last line fails because “group” is part of the key in x.dt, but not y.dt:

Error in bmerge(i, x, leftcols, rightcols, io, xo, roll, rollends, nomatch,  :
  typeof x.date (double) != typeof i.group (integer)

I would like the value of y in each row of the result to be the month number (from y.dt), regardless of the value of the group column.
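
As far as I understand it, the error arises because a join without `on` pairs key columns by position, so y.dt's only key column (date) is matched against x.dt's first key column (group). One possible way around it, sketched here as an alternative and not as the answer given in this thread, is to name the join column explicitly with `on`:

```r
library(data.table)

x.dt <- as.data.table(
  expand.grid(date = seq(as.Date("2017-01-01"), as.Date("2017-12-31"), "days"),
              group = c("A", "B"))
)
setkey(x.dt, group, date)

y.dt <- data.table(date = seq(as.Date("2017-01-01"), as.Date("2017-12-31"), "months"))
y.dt[, y := month(date)]
setkey(y.dt, date)

# Naming the join column overrides the key-based pairing, so group is
# carried along as an ordinary column and only date is matched (and rolled).
result <- y.dt[x.dt, on = .(date), roll = TRUE]
```

Unlike the update-join form, this returns a new table and leaves x.dt unchanged.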

Thanks.

- Elliot



