[datatable-help] eval(eval issue
Matthew Dowle
mdowle at mdowle.plus.com
Sun Aug 14 14:48:41 CEST 2011
Thanks - reproduced. The error seems to come from the line
tmp <- summary(x[!is.na(x)])
in percent_in_category. summary() returns per-level counts for a factor,
but Length/Class/Mode (stored as character) for a character vector, so the
later sum() fails with "invalid 'type' (character) of argument".
The error doesn't seem to be a data.table one, and looks correct. I
downgraded to 1.6.2 and the same error happens there, so I couldn't
reproduce any change in behaviour, I'm afraid. Maybe the difference is
higher up/earlier in the real code.
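
To illustrate the point about summary(), here is a minimal base R sketch. The toy vectors and the names x_chr, x_fac, s_fac, s_chr, t_chr are made up for illustration and are not from the original code; the table() variant at the end is only one possible alternative, not necessarily what the real code should use.

```r
# Toy vectors: the same values stored as character and as factor.
# (Hypothetical data, not taken from the original thread.)
x_chr <- c("Catch Up: Yes", "Catch Up: No", "Catch Up: Yes", NA)
x_fac <- factor(x_chr)

# summary() on a factor gives a named vector of per-level counts.
s_fac <- summary(x_fac[!is.na(x_fac)])

# summary() on a character vector gives Length/Class/Mode, stored as character.
s_chr <- summary(x_chr[!is.na(x_chr)])

# Indexing by level name and summing works for the factor summary ...
n_yes <- sum(s_fac["Catch Up: Yes"])

# ... but sum() cannot add character values, so this signals an error,
# which is captured here instead of stopping the script.
err <- tryCatch(sum(s_chr["Catch Up: Yes"]), error = function(e) e)

# One possible alternative: table() counts distinct values for either
# input type (NA values are dropped by default).
t_chr <- table(x_chr)
n_yes_chr <- sum(t_chr["Catch Up: Yes"])
```

One difference to keep in mind: a factor summary includes levels with zero count, whereas table() on a character vector only knows the values that actually occur, so indexing by an absent category gives NA rather than 0.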
Matthew
On Sun, 2011-08-14 at 06:22 -0500, Damian Betebenner wrote:
> Thanks Matthew,
>
> Updated to 1.6.4 from 1.6.2 and the error started. Using R 2.13.0.
>
> The issue (now somewhat of a non-issue) arises because I was mistakenly passing a character vector instead of a factor. When a factor gets passed there
> is no problem. I'm not completely sure why this was working previously and now there's a problem. It did, however, root out an inconsistency in my data
> that needed fixing :-)
>
>
>
> Here's an example:
>
>
> ### Some toy data
>
> my.factor.levels <- c('Catch Up: Yes', 'Catch Up: No', 'Keep Up: Yes', 'Keep Up: No')
> my.dt1 <- data.table(X= sample(my.factor.levels, 100, replace=TRUE), Y=rep(1:2, each=50), Z=sample(c("M", "F"), 100, replace=TRUE))
> my.dt2 <- my.dt1
> my.dt2$X <- as.character(my.dt2$X)
>
> ### Utility function
>
> percent_in_category <- function (x, in.categories, of.categories, result.digits = 1)
> {
> if (!is.list(in.categories))
> in.categories <- list(in.categories)
> if (!is.list(of.categories))
> of.categories <- list(of.categories)
> tmp.result <- list()
> tmp <- summary(x[!is.na(x)])
> for (i in seq(length(in.categories))) {
> tmp.result[[i]] <- round(100 * sum(tmp[in.categories[[i]]])/sum(tmp[of.categories[[i]]]),
> digits = result.digits)
> }
> return(unlist(tmp.result))
> }
>
>
> ### data.table call (my.dt2) that produces error
>
> my.dt1[,percent_in_category(X,list('Catch Up: Yes'), list(c('Catch Up: Yes', 'Catch Up: No', 'Keep Up: Yes', 'Keep Up: No'))), by=list(Y, Z)]
>
> my.dt2[,percent_in_category(X,list('Catch Up: Yes'), list(c('Catch Up: Yes', 'Catch Up: No', 'Keep Up: Yes', 'Keep Up: No'))), by=list(Y, Z)]
>
>
>
>
>
>
> Damian Betebenner
> Center for Assessment
> PO Box 351
> Dover, NH 03821-0351
>
> Phone (office): (603) 516-7900
> Phone (cell): (857) 234-2474
> Fax: (603) 516-7910
>
> dbetebenner at nciea.org
> www.nciea.org
>
>
>
>
> -----Original Message-----
> From: Matthew Dowle [mailto:mdowlenoreply at virginmedia.com] On Behalf Of Matthew Dowle
> Sent: Sunday, August 14, 2011 5:49 AM
> To: Damian Betebenner
> Cc: datatable-help at lists.r-forge.r-project.org
> Subject: Re: [datatable-help] eval(eval issue
>
> Not sure why there's an extra level of quote() inside the string it's
> parsing; i.e., instead of :
>
> ListExpr <- parse(text=paste("quote(as.list(c(",
> paste(unlist(tmp.sgp.summaries), collapse=", "),")))",sep=""))
>
> ByExpr <- parse(text=paste("quote(list(", paste(sgp.groups.to.summarize,
> collapse=", "), "))", sep=""))
>
> tmp <- tmp.dt[, eval(eval(ListExpr)), by=eval(eval(ByExpr))]
>
>
> try removing the quote() as follows (also changing the as.list to list
> to make a considerable speed improvement as per wiki example 4) :
>
> ListExpr <- parse(text=paste("list(", paste(unlist(tmp.sgp.summaries),
> collapse=", "),")",sep=""))
>
> ByExpr <- parse(text=paste("list(", paste(sgp.groups.to.summarize,
> collapse=", "), ")", sep=""))
>
> tmp <- tmp.dt[, eval(ListExpr), by=eval(ByExpr)]
>
>
> If that doesn't fix it then we'll need a reproducible example please
> that we can paste into R. Should be possible to create one in this case
> as it's a type error. Also, please confirm that you're using v1.6.4 and
> which version of R, as this kind of error can be affected by that.
>
> Matthew
>
>
> On Sat, 2011-08-13 at 20:49 -0500, Damian Betebenner wrote:
> > As an update. The issue that seems to be throwing the error is the
> > call to a function “percent_in_category” that has single quoted
> > expressions in the ListExpr:
> >
> >
> >
> > Browse[1]> ListExpr
> >
> > expression(quote(as.list(c(median_na(SGP), median_na(SGP_TARGET),
> > percent_in_category(CATCH_UP_KEEP_UP_STATUS, list(c('Catch Up: Yes',
> > 'Keep Up: Yes')), list(c('Catch Up: Yes', 'Catch Up: No', 'Keep Up:
> > Yes', 'Keep Up: No'))), num_non_missing(SGP),
> > percent_in_category(ACHIEVEMENT_LEVEL, list(c('Proficient',
> > 'Advanced')), list(c('Unsatisfactory', 'Partially Proficient',
> > 'Proficient', 'Advanced'))), num_non_missing(ACHIEVEMENT_LEVEL),
> > percent_in_category(ACHIEVEMENT_LEVEL_PRIOR, list(c('Proficient',
> > 'Advanced')), list(c('Unsatisfactory', 'Partially Proficient',
> > 'Proficient', 'Advanced'))),
> > num_non_missing(ACHIEVEMENT_LEVEL_PRIOR)))))
> >
> >
> >
> > When I isolate in my testing, it gives the same error:
> >
> >
> >
> > Error during wrapup: invalid 'type' (character) of argument
> >
> >
> >
> >
> >
> > Does anybody know of a workaround for this or a simplification of the
> > approach I’m taking?
> >
> >
> >
> > thanks,
> >
> >
> >
> > Damian
> >
> > From: Damian Betebenner
> > Sent: Saturday, August 13, 2011 9:05 PM
> > To: 'datatable-help at lists.r-forge.r-project.org'
> > Subject: eval(eval issue
> >
> >
> >
> >
> > All,
> >
> >
> >
> > A line of code that used to work in version 1.6.2 is now throwing an
> > error so I thought I’d see if anyone knows exactly what is going on.
> > The error that I now get is:
> >
> >
> >
> > Error during wrapup: invalid 'type' (character) of argument
> >
> >
> >
> > The call that is generating the error is:
> >
> >
> >
> > tmp <- tmp.dt[, eval(eval(ListExpr)), by=eval(eval(ByExpr))]
> >
> >
> >
> >
> >
> > It is essentially a
> >
> >
> >
> > x[,j=ListExpr,by=ByExpr] call but the j is a long list of custom functions
> > and the ByExpr is a list of variables in x that changes via a loop.
> >
> >
> >
> > Currently (what is broken), the ListExpr and ByExpr are created using
> > the following:
> >
> >
> >
> > ListExpr <- parse(text=paste("quote(as.list(c(",
> > paste(unlist(tmp.sgp.summaries), collapse=", "),")))",sep=""))
> >
> > ByExpr <- parse(text=paste("quote(list(",
> > paste(sgp.groups.to.summarize, collapse=", "), "))", sep=""))
> >
> >
> >
> > For example, ListExpr is:
> >
> >
> >
> > Browse[1]> ListExpr
> >
> > expression(quote(as.list(c(median_na(SGP), median_na(SGP_TARGET),
> > percent_in_category(CATCH_UP_KEEP_UP_STATUS, list(c('Catch Up: Yes',
> > 'Keep Up: Yes')), list(c('Catch Up: Yes', 'Catch Up: No', 'Keep Up:
> > Yes', 'Keep Up: No'))), num_non_missing(SGP),
> > percent_in_category(ACHIEVEMENT_LEVEL, list(c('Proficient',
> > 'Advanced')), list(c('Unsatisfactory', 'Partially Proficient',
> > 'Proficient', 'Advanced'))), num_non_missing(ACHIEVEMENT_LEVEL),
> > percent_in_category(ACHIEVEMENT_LEVEL_PRIOR, list(c('Proficient',
> > 'Advanced')), list(c('Unsatisfactory', 'Partially Proficient',
> > 'Proficient', 'Advanced'))),
> > num_non_missing(ACHIEVEMENT_LEVEL_PRIOR)))))
> >
> >
> >
> >
> >
> > And ByExpr is:
> >
> >
> >
> > Browse[1]> ByExpr
> >
> > expression(quote(list(STATE, CONTENT_AREA, YEAR, GRADE,
> > STATE_ENROLLMENT_STATUS, CATCH_UP_KEEP_UP_STATUS_INITIAL)))
> >
> >
> >
> >
> >
> > I’m sure this can be simplified.
> >
> >
> >
> > Any help greatly appreciated,
> >
> >
> >
> > Damian
> >
> > _______________________________________________
> > datatable-help mailing list
> > datatable-help at lists.r-forge.r-project.org
> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
>
>