[datatable-help] convenience function for transforming variables, and adding them to the table

Short, Tom TShort at epri.com
Thu May 27 21:02:01 CEST 2010


> Sasha Goodman wrote:
>
> I'm trying to make a simple convenience function for the
> following common procedure, where one variable is transformed
> with an arbitrary function and merged as a variable to the table:
>  
> dt <- data.table(A = rep(1:3, each=4), B = rep(1:4, each=3),
>                  C = rep(1:2, 6))
> dt[, transform(.SD,D=mean(A)), by="B"]
>  
> Here is my first attempt, but it won't run because of scoping issues. 
>  
> dt.groupby <- function(data,grouping, ...) {
> data[, transform(.SD,expr=...), by=grouping]
> }
>  

Tough problem. The best I could do was:

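dt.groupby <- function(data, grouping, ...) {
    # substitute() captures the grouping argument unevaluated; bquote()
    # splices it into the call via .( ), and eval() then runs the call.
    eval(bquote(data[, transform(.SD, ...),
                     by = .(substitute(grouping))]))
}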

It relies on some funky language manipulation that I always have
a tough time with. It also fails for multiple groupings:

dt.groupby(dt, B, D = mean(A), E = median(A))
dt.groupby(dt, "B", D = mean(A), E = median(A))
dt.groupby(dt, list(B,C), D = mean(A), E = median(A)) # fails
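
To see what the language manipulation does on its own, here is a minimal base-R sketch of just the capture step. The names show_by and f are hypothetical placeholders (f is never evaluated, it only stands in for the data[...] call), and the sketch does not use data.table or try to explain the failure with multiple groupings:

```r
# Hypothetical helper, base R only: build (but do not evaluate) a call
# with the grouping argument spliced in unevaluated.
show_by <- function(grouping) {
  # substitute(grouping) returns the expression the caller typed;
  # bquote() replaces .( ... ) with that expression inside the call.
  bquote(f(by = .(substitute(grouping))))
}

show_by(B)            # the by argument is the symbol B
show_by("B")          # the by argument is the string "B"
show_by(list(B, C))   # the by argument is the call list(B, C)
```

In all three cases the argument arrives in the call exactly as typed, so whatever goes wrong with list(B, C) presumably happens later, when the constructed call is evaluated.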



> Any suggestions? It would also be nice if duplicate columns were
> not created, such as the "B.1" the first procedure adds.

Yes, it would. We probably don't want to take those columns out of
.SD because they might be useful. I'm not sure how to get transform
to ignore them. 

- Tom

 

