[datatable-help] Unable to have expression for "by" criteria
Matthew Dowle
mdowle at mdowle.plus.com
Tue Jun 29 22:34:15 CEST 2010
Harish,
This bug is fixed, revision 101 just committed. Bug #977 closed, thanks
for raising it. Test 170 added.
Matthew
On Fri, 2010-06-25 at 20:30 +0100, Matthew Dowle wrote:
> Thanks, yes, I see that too. Could you enter that as a bug in the tracker on
> r-forge please, now it's confirmed? It saves me a bit of time and helps me
> not to forget.
> Matthew
>
> On Fri, 2010-06-25 at 11:56 -0700, Harish wrote:
> > Thanks for the fix Matthew.
> >
> > While testing it out, I ran into another related issue.
> >
> > DT <- data.table( a=1:5, b=11:50, d=c("A","B","C","D") )
> > g <- quote( list( d ) )
> > identical( DT[ , list(d) ], DT[ , eval(g) ] ) # Expect TRUE but get FALSE
> >
> > When the quote has a list and I use it for "j", I get a list -- not a data.table -- as a response. If the quote is just a variable, it works as expected and returns a vector.
> >
> >
> > Regards,
> > Harish
> >
> > --- On Wed, 6/23/10, Matthew Dowle <mdowle at mdowle.plus.com> wrote:
> >
> > > From: Matthew Dowle <mdowle at mdowle.plus.com>
> > > Subject: Re: [datatable-help] Unable to have expression for "by" criteria
> > > To: "Harish" <harishv_99 at yahoo.com>
> > > Cc: datatable-help at lists.r-forge.r-project.org
> > > Date: Wednesday, June 23, 2010, 2:53 PM
> > > Those two bugs fixed now, just committed.
> > >
> > > DT <- data.table( a=1:5, b=11:50, d=c("A","B","C","D") )
> > > f <- quote( list(d) )
> > > DT[ , mean(b), by=eval(f) ] # works
> > > foo <- function( grp ) {
> > > DT[ , mean(b), by=eval( grp ) ]
> > > }
> > > foo( quote( list(d) ) ) # works, colname in result is d
> > > foo( quote( list(d,a) ) ) # works
> > > foo( quote( list(d,even=a%%2L) ) ) # works
> > >
> > > Also fixed where a colname is called grp, the same name as the variable
> > > holding the expression. Before, the eval would see the grp column first
> > > and complain that it didn't evaluate to a list. Now when eval is passed to
> > > by, that eval is done in the calling frame, before using the result within
> > > the frame. So now, this works:
> > >
> > > DT <- data.table( a=1:5, b=11:50, d=c("A","B","C","D"), f=1:5, grp=1:5 )
> > > DT[,mean(b),by=eval(f)] # works using quote(list(d)) not f column
> > > f = quote(list(grp))
> > > foo(f) # works, groups by the grp column
> > >
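The name clash described above can be reproduced without data.table. What follows is a minimal base-R sketch, not data.table's internal code: keys_data_first() and keys_frame_first() are hypothetical helpers, and a plain data.frame stands in for DT. The first looks up the argument of eval() among the columns first, as the old behaviour is described; the second resolves it in the function's own frame before evaluating against the columns.

```r
# Toy data with a column named 'grp', the same name as the argument below.
df <- data.frame(b = 1:6, d = rep(c("A", "B", "C"), 2), grp = 1:6)

# Hypothetical helper: evaluates eval(grp) with the columns in scope first,
# so the symbol 'grp' resolves to the grp column, not the quoted expression.
keys_data_first <- function(data, grp) {
  eval(quote(eval(grp)), data, environment())
}

# Hypothetical helper: resolves 'grp' in the function's own frame first,
# then evaluates the resulting call with the columns in scope.
keys_frame_first <- function(data, grp) {
  expr <- grp
  eval(expr, data, parent.frame())
}

by_expr <- quote(list(grp))
clash <- keys_data_first(df, by_expr)   # the integer column itself, not a list
fixed <- keys_frame_first(df, by_expr)  # a list holding the grp column
print(is.list(clash))
print(is.list(fixed))
```

Only the order of the two lookups differs between the helpers, which is the point of the fix described above.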
> > > Matthew
> > >
> > > On Mon, 2010-06-21 at 08:10 +0100, Matthew Dowle wrote:
> > > > Thanks for raising this one. Have just committed a fix for that, latest
> > > > version on r-forge.
> > > >
> > > > DT <- data.table( a=1:5, b=11:50, d=c("A","B","C","D") )
> > > > f <- quote( list(d) )
> > > > DT[ , mean(b), by=eval(f) ] # worked before
> > > > foo <- function( grp ) {
> > > > DT[ , mean(b), by=eval( grp ) ]
> > > > }
> > > > foo( quote( list(d) ) ) # works now
> > > >
> > > > The column names of the result are 'f' and 'grp' respectively though,
> > > > rather than d. Bug #974 raised for that.
> > > >
> > > > Multiple expressions in the quoted by don't yet work:
> > > > > foo( quote( list(d,a) ) )
> > > > Error in bysubl[[jj + 1]] : subscript out of bounds
> > > > >
> > > > Bug #975 raised for that.
> > > >
> > > > Matthew
> > > >
> > > >
> > > > On Fri, 2010-06-18 at 23:15 -0700, Harish wrote:
> > > > > Thanks. The eval() did the trick in the simplified example.
> > > > >
> > > > > Now I run into another hurdle when I make the code a little more complex.
> > > > >
> > > > > # ==================
> > > > >
> > > > > DT <- data.table( a=1:5, b=11:50, d=c("A","B","C","D") )
> > > > > f <- quote( list(d) )
> > > > > DT[ , mean(b), by=eval(f) ] # Now this works; thanks
> > > > > foo <- function( grp ) {
> > > > >   DT[ , mean(b), by=eval( grp ) ]
> > > > > }
> > > > > foo( list(d) ) # Gives an error
> > > > > foo( quote( list(d) ) ) # Also gives the same error
> > > > >
> > > > > # ==================
> > > > >
> > > > > The error I get is:
> > > > > Error in eval(grp) : object 'grp' not found
> > > > >
> > > > >
> > > > > Conceptually, it looks like it should work.
> > > > >
> > > > >
> > > > > Regards,
> > > > > Harish
> > > > >
> > > > >
> > > > > --- On Fri, 6/18/10, mdowle at mdowle.plus.com <mdowle at mdowle.plus.com> wrote:
> > > > > >
> > > > > > From: mdowle at mdowle.plus.com <mdowle at mdowle.plus.com>
> > > > > > Subject: Re: [datatable-help] Unable to have expression for "by" criteria
> > > > > > To: "Harish" <harishv_99 at yahoo.com>
> > > > > > Cc: datatable-help at lists.r-forge.r-project.org
> > > > > > Date: Friday, June 18, 2010, 4:50 AM
> > > > > > Try this (works for me) :
> > > > > >
> > > > > > f <- quote( list(d) )
> > > > > > DT[ , mean(b), by=eval(f) ]
> > > > > >
> > > > > > If that works, you were very close, just needed to use eval in the by. I
> > > > > > _think_ this makes sense as syntax: something needs to signal to the
> > > > > > reader of the query that f is not a column name but a pre-defined
> > > > > > expression, for clarity.
> > > > > >
> > > > > > I basically need to have another look at this and tidy up the
> > > > > > documentation and examples. Might add a FAQ on it. There were big changes
> > > > > > in this area internally when grouping was sped up, e.g. it's very recent
> > > > > > that by can be list(); by used to be just a character string.
> > > > > >
> > > > > > You can do the same thing for j btw. Kind of like a macro. There might
> > > > > > already be a FAQ on that.
> > > > > >
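The macro-like reuse of a quoted expression mentioned above can be sketched in base R. This is an illustration only: plain data.frames stand in for data.tables, and agg, first and second are hypothetical names. In data.table the stored expression would instead be passed inside the query, in the same way as the by=eval(f) form used in this thread.

```r
# A j-style expression defined once and stored unevaluated.
agg <- quote(list(avg_b = mean(b), n = length(b)))

# Two small tables that both have a column b.
first  <- data.frame(b = c(1, 2, 3))
second <- data.frame(b = c(10, 20, 30, 40))

# eval() runs the stored expression with each table's columns in scope.
res_first  <- eval(agg, first)
res_second <- eval(agg, second)
str(res_first)
str(res_second)
```

Keeping the expression in one place means a change to the aggregation is made once and picked up by every query that evaluates it.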
> > > > > > It's quite neat actually what R allows ... I don't believe in SQL you can
> > > > > > as easily create expressions for criteria (select, group by and where) and
> > > > > > re-use them like this.
> > > > > >
> > > > > > I'll need an example for #2 as I don't quite follow that. Maybe it drops
> > > > > > out of the answer above?
> > > > > >
> > > > > > Matthew
> > > > > >
> > > > > >
> > > > > > > > I am trying to compute some values in a data.table by dynamically
> > > > > > > > generating the "by" criteria. However, I am unable to figure out how to
> > > > > > > > do it. (I had to resort to using the "plyr" package.)
> > > > > > >
> > > > > > > Questions:
> > > > > > > > 1) Why am I unable to pass a variable for the "by" criteria? The comments
> > > > > > > > in the code indicate that it should be possible.
> > > > > > > > 2) Assuming the issue is a bug (and will be fixed), what is a "good" way
> > > > > > > > for me to accomplish dynamically creating the criteria? (This is more of a
> > > > > > > > generic R question I suppose.)
> > > > > > >
> > > > > > > -----
> > > > > > >
> > > > > > > > Question #1 -- Unable to pass variable for "by" criteria
> > > > > > >
> > > > > > > > The comments in the code state: "The by expression also see variables in
> > > > > > > > the calling frame, just like j... but from v1.3 is e.g. bycriteria =
> > > > > > > > quote(list(colA,colB%%100)); DT[...,by=bycriteria]"
> > > > > > > >
> > > > > > > > Then the following code should work, but it does not.
> > > > > > >
> > > > > > > > DT <- data.table( a=1:5, b=11:50, d=c("A","B","C","D"))
> > > > > > > > DT[ , mean(b), by=d ] # Works
> > > > > > > > DT[ , mean(b), by=list(d) ] # Works
> > > > > > > > f <- quote( list(d) )
> > > > > > > > DT[ , mean(b), by=f ] # This does not work
> > > > > > >
> > > > > > > The response is:
> > > > > > > > Error in `[.data.table`(DT, , mean(b), by = f) :
> > > > > > > > column 1 of 'by' list does not evaluate to integer e.g. the by should be
> > > > > > > > a list of expressions. Do not quote column names when using
> > > > > > > > by=list(...).
> > > > > > >
> > > > > > > What is going on?
> > > > > > >
> > > > > > > -----
> > > > > > >
> > > > > > > > Question #2 -- Tips for dynamically generating criteria
> > > > > > > >
> > > > > > > > What are some tips and generally accepted approaches (in R) to dynamically
> > > > > > > > generate criteria?
> > > > > > >
> > > > > > > > The first step is to generate a list of columns to group by. How should I
> > > > > > > > structure the function that gathers this info? Is it better to get them
> > > > > > > > straight as strings, or should I get the variables directly and then use
> > > > > > > > either match.call() or deparse(substitute(x)) to convert to strings? The
> > > > > > > > aes() function in ggplot2 uses the match.call() approach. Or is the
> > > > > > > > conversion to strings even required? (The plyr package accepted a vector
> > > > > > > > of strings, and a few other formats for its "by" criteria.)
> > > > > > >
> > > > > > > > The next step is to generate the "language" or "symbol" object that I need
> > > > > > > > to create. I would appreciate some guidance on how I can put together my
> > > > > > > > required columns into the by criteria dynamically.
> > > > > > > >
> > > > > > > > (I understand that it is a generic R question, but since it is so closely
> > > > > > > > related to my question #1, I am asking both of them here.)
> > > > > > >
> > > > > > > Thanks for your help.
> > > > > > >
> > > > > > >
> > > > > > > Regards,
> > > > > > > Harish
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > _______________________________________________
> > > > > > > datatable-help mailing list
> > > > > > > datatable-help at lists.r-forge.r-project.org
> > > > > > > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > > _______________________________________________
> > > > datatable-help mailing list
> > > > datatable-help at lists.r-forge.r-project.org
> > > > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
> > >
> > >
> > >
> >
> >
> >
>
>
> _______________________________________________
> datatable-help mailing list
> datatable-help at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
More information about the datatable-help
mailing list