[datatable-help] Problem with FAQ 2.8

Michael Nelson michael.nelson at sydney.edu.au
Fri Jun 7 05:50:49 CEST 2013


This is related to 

FR 2693

https://r-forge.r-project.org/tracker/index.php?func=detail&aid=2693&group_id=240&atid=978

What is happening is that the join columns must be referenced in `j` using their names as defined in `i` (or Y in X[Y] syntax).

The FAQ doesn't explicitly cover how you are supposed to reference the columns used in the join.

Perhaps some binding magic could be used to ensure that either column name could be used. I don't think it is useful to have both defined and available as separate objects; that would mean there were two copies of something that are identical in value (but not name!)
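
For comparison only, and not as a description of how data.table works internally, base R's merge() settles the same naming question by keeping a single key column named after the first table. A minimal sketch, assuming plain data frames that mirror d1 and d2 from the quoted message below (the names m and res are just for illustration):

```r
# Data frames mirroring d1 and d2 from the quoted message (base R only).
d1 <- data.frame(id1 = c(1L, 2L, 2L, 3L), val = 1:4)
d2 <- data.frame(id2 = c(1L, 2L, 4L), val2 = c(11, 12, 14))

# Inner join on differently named key columns.
m <- merge(d1, d2, by.x = "id1", by.y = "id2")

# merge() keeps one key column, named as in the first table (id1);
# id2 does not appear in the result.
names(m)

# Per-key sum of id1 * val over the matched rows.
res <- tapply(m$id1 * m$val, m$id1, sum)
res
```

Note that this is an inner join, so the unmatched key 4 is dropped rather than reported as NA as in the X[Y] output.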


________________________________________
From: datatable-help-bounces at lists.r-forge.r-project.org [datatable-help-bounces at lists.r-forge.r-project.org] on behalf of Gabor Grothendieck [ggrothendieck at gmail.com]
Sent: Friday, 7 June 2013 11:22 AM
To: datatable-help at lists.r-forge.r-project.org
Subject: [datatable-help] Problem with FAQ 2.8

FAQ 2.8 says:
2.8 What are the scoping rules for j expressions?
Think of the subset as an environment where all the column names are
variables. When a variable foo is used in the j of a query such as
X[Y,sum(foo)], foo is looked for in the following order:
1. The scope of X's subset; i.e., X's column names.
2. The scope of each row of Y; i.e., Y's column names (join inherited scope)
...
but consider the following (which is modified from this example:
https://r-forge.r-project.org/tracker/?func=detail&atid=975&aid=1663&group_id=240):

> d1 <- data.table(id1 = c(1L, 2L, 2L, 3L), val = 1:4, key = "id1")
> d2 <- data.table(id2 = c(1L, 2L, 4L), val2 = c(11, 12, 14),key = "id2")
>
> d1[d2, sum(id2 * val)]
   id1 V1
1:   1  1
2:   2 10
3:   4 NA
>
> d1[d2, sum(id1 * val)]
Error in `[.data.table`(d1, d2, sum(id1 * val)) : object 'id1' not found

Note that column id1 of d1 is not in scope, contrary to point 1.

Even stranger is that d1[, id1] works but d1[d2, id1] does not.

Is the FAQ describing how it's supposed to work and the actual behavior
is wrong, or is the behavior as intended and the FAQ wrong?



--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com