[datatable-help] Cannot access cols of y when doing x[y, ...]
Matthew Dowle
mdowle at mdowle.plus.com
Tue Feb 1 23:12:47 CET 2011
Hi Prasad and all,
Join inherited scope is now back on, in v1.5.3 on R-Forge.
FAQ 1.11 has been simplified; see latest pdf on homepage.
Please check the NEWS link for v1.5.3, as there are
several other changes too, e.g. X[Y] now includes Y's
non-join columns for consistency with join inherited
scope (JIS). FAQ 1.12 now strongly encourages X[Y,j]
rather than X[Y].
v1.5.3 contains several changes that may require changes
to existing code. Please check and let us know if these
will cause any issues.
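
As a minimal sketch of the X[Y,j] form mentioned above, the following reuses the x and y tables from the original report quoted further down. It assumes a data.table release in which j can see Y's columns during the join; the names via_merge, via_join and prod are illustrative only. The exact shape of the result (for example, whether the join column is returned alongside the computed one) may differ between data.table versions, so the sketch names the computed column and compares only that.

```r
library(data.table)

# Tables taken from the original report in this thread
x <- data.table(foo = c(1, 1, 1, 2, 2, 3), a = 1:6, key = "foo")
y <- data.table(foo = c(1, 2), boo = 10:11, key = "foo")

# Two-step form: merge on the shared key, then evaluate the expression
via_merge <- with(merge(x, y), foo * boo)

# One-step form: join x to y and evaluate j with y's column boo in scope
# (assumes join inherited scope is available in the installed version)
via_join <- x[y, list(prod = foo * boo)]

# Both forms should give the same products
all.equal(via_join$prod, via_merge)
```

Naming the result column in j, rather than relying on a default name, keeps the comparison independent of how a given version labels unnamed results.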
http://datatable.r-forge.r-project.org/
Matthew
On Sat, 2011-01-22 at 19:57 -0500, Prasad Chalasani wrote:
> Thank you for clarifying this! I was getting stuck on this. Yes, the
> syntax and speedup are very nice indeed. I am dealing with 10 million
> row tables, so it is very useful for me.
>
> Incidentally, I also asked this question on StackOverflow, and updated
> it based on your reply:
> http://stackoverflow.com/questions/4764434/r-when-using-data-table-how-do-i-get-columns-of-y-when-i-do-xy
>
> On Sat, Jan 22, 2011 at 4:48 PM, Matthew Dowle
> <mdowle at mdowle.plus.com> wrote:
> Welcome to the list.
> You're right, the FAQ is wrong.
> FR#1095 is "Turn back on 'join inherited scope'".
> This was a known problem in NEWS at v1.4 and still is.
> When the grouping code was moved from R into C in v1.4 that feature
> wasn't something that made it into the port.
>
> Glad you appreciate the neater syntax and yes it should be faster (the
> more columns in x and y the faster the speed up could be, over a merge
> followed by a query).
>
> I'll try and take a look soon.
>
> Matthew
>
>
>
> On Sat, 2011-01-22 at 10:41 -0500, Prasad Chalasani wrote:
> > The Data-table FAQ 1.11 states:
> >
> > "When you write x[y,foo*boo], data.table automatically inspects the j
> > expression to see which columns it uses.
> > It will only subset, or group, those columns only. Memory is only
> > created for the columns the j uses.
> >
> > Let’s say foo is in x, and boo is in y (along with 20 other columns in
> > y).
> >
> > Isn’t x[y,foo*boo] quicker to program and quicker to run than a merge
> > step followed by another subset step ?"
> >
> >
> > Contrary to what it says above, I get an error when I try to access a
> > y-column in the "j" argument of x[y,j].
> >
> > See the sequence of code below.
> >
> >
> > > x <- data.table( foo = c(1,1,1,2,2,3), a = 1:6, key = 'foo')
> > > y <- data.table( foo = c(1,2), boo = 10:11, key = 'foo')
> >
> > # the below works as expected
> > > x[y]
> >      foo a
> > [1,]   1 1
> > [2,]   2 4
> >
> > > with( merge(x,y), foo*boo)
> > [1] 10 10 10 22 22
> >
> > # I want to achieve the same result as the above using the
> > # syntactically more compact (and faster?) code below:
> > > x[y, foo * boo ]
> > Error in eval(expr, envir, enclos) : object 'boo' not found
> >
> >
> > So is the FAQ just wrong, or am I misunderstanding something?
> >
> >