[datatable-help] Cannot access cols of y when doing x[y, ...]

Prasad Chalasani pchalasani at gmail.com
Sun Jan 23 01:57:28 CET 2011


Thank you for clarifying this! I was getting stuck on it. Yes, the syntax
and speedup are very nice indeed. I am dealing with 10-million-row tables, so
this is very useful for me.

Incidentally, I also asked this question on StackOverflow, and updated it
based on your reply:
http://stackoverflow.com/questions/4764434/r-when-using-data-table-how-do-i-get-columns-of-y-when-i-do-xy
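
Until FR#1095 is addressed, here is a minimal sketch of the two-step
workaround I am using, with the same x and y as in my original message
quoted below. The names xy and res are only illustrative, and I am assuming
merge() is given the shared key column explicitly via by = "foo". It is the
"merge step followed by another subset step" that the FAQ compares against,
so it does not get the memory savings the FAQ describes.

```r
library(data.table)

# Same example tables as in the original message.
x <- data.table(foo = c(1, 1, 1, 2, 2, 3), a = 1:6, key = "foo")
y <- data.table(foo = c(1, 2), boo = 10:11, key = "foo")

# Step 1: merge on the shared key so that boo becomes an ordinary
# column sitting next to foo (inner join: foo == 3 is dropped).
xy <- merge(x, y, by = "foo")

# Step 2: evaluate the j expression on the merged table.
res <- xy[, foo * boo]
print(res)
```

This should give the same vector as with(merge(x,y), foo*boo), just
written in data.table syntax.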



On Sat, Jan 22, 2011 at 4:48 PM, Matthew Dowle <mdowle at mdowle.plus.com> wrote:

> Welcome to the list.
> You're right, the FAQ is wrong.
> FR#1095 is "Turn back on 'join inherited scope'".
> This was a known problem in NEWS at v1.4 and still is.
> When the grouping code was moved from R into C in v1.4 that feature
> wasn't something that made it into the port.
>
> Glad you appreciate the neater syntax and yes it should be faster (the
> more columns in x and y the faster the speed up could be, over a merge
> followed by a query).
>
> I'll try and take a look soon.
>
> Matthew
>
>
> On Sat, 2011-01-22 at 10:41 -0500, Prasad Chalasani wrote:
> > The Data-table FAQ 1.11 states:
> >
> >
> > "When you write x[y,foo*boo], data.table automatically inspects the j
> > expression to see which columns it uses.
> > It will only subset, or group, those columns only. Memory is only
> > created for the columns the j uses.
> >
> > Let’s say foo is in x, and boo is in y (along with 20 other columns in
> > y).
> >
> > Isn’t x[y,foo*boo] quicker to program and quicker to run than a merge
> > step followed by another subset step ?"
> >
> >
> > Contrary to what it says above, I get an error when I try to access a
> > y-column in the "j" argument of x[y,j].
> >
> > See the sequence of code below.
> >
> >
> > > x <- data.table( foo = c(1,1,1,2,2,3), a = 1:6, key = 'foo')
> > > y <- data.table( foo = c(1,2), boo = 10:11, key = 'foo')
> >
> > # the below works as expected
> > > x[y]
> >      foo a
> > [1,]   1 1
> > [2,]   2 4
> >
> > > with( merge(x,y), foo*boo)
> > [1] 10 10 10 22 22
> >
> > # I want to achieve the same result as the above using the
> > # syntactically more compact (and faster?) code below:
> > > x[y, foo * boo ]
> > Error in eval(expr, envir, enclos) : object 'boo' not found
> >
> > So is the FAQ just wrong, or am I misunderstanding something?
> >
> >