[datatable-help] Should by=character(0) just perform as if no by was called?

Matthew Dowle mdowle at mdowle.plus.com
Wed Sep 21 15:01:00 CEST 2011


Yes, agreed.  Please add as a bug report.

> DT = data.table(a=1:2,b=1:6)
> DT[,sum(b),by=NULL]
Error in bysubl[[1]] : subscript out of bounds

should be the same as:

> DT[,sum(b)]
[1] 21
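
Until that is fixed, the check Chris describes can be wrapped up once rather than repeated at every call site. The following is only a sketch: agg_xy is a hypothetical helper name (not part of data.table), and it assumes the table has numeric columns x and y as in the example quoted below.

```r
library(data.table)

# Hypothetical helper, not part of data.table: sum columns x and y,
# dropping the `by` argument entirely when there is nothing to group on.
agg_xy <- function(dt, bycols) {
  if (length(bycols) == 0L) {
    # No grouping columns: aggregate over the whole table (one-row result).
    dt[, list(x = sum(x), y = sum(y))]
  } else {
    # bycols is a character vector of column names to group by.
    dt[, list(x = sum(x), y = sum(y)), by = bycols]
  }
}

# Usage, mirroring the quoted example (table keyed on "t" only):
DT2 <- data.table(t = rep(1:10, 10), x = rnorm(100), y = rnorm(100), key = "t")
agg_xy(DT2, setdiff(key(DT2), "t"))   # one row holding the overall sums
```

The branch lives in one place, so callers can pass setdiff(key(dt), "t") without checking whether it is empty.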

Thanks,
Matthew

"Chris Neff" <caneff at gmail.com> wrote in message 
news:CAAuY0RUt2y0S87N5NOyeV_syJ3-5pYZni3cXd6UpeEhchasmLA at mail.gmail.com...
> My functions sometimes accept several possible inputs, some of which
> have more key columns than others.  I often then aggregate over a
> subset of those key columns, and occasionally that subset turns out to
> be empty, so I end up trying to aggregate over no columns. An example:
>
> Suppose I have two data.tables, one with "t" and "g" as keys and
> another with just "t" as a key.
>
> DT1 <- data.table( t=rep(1:10,10), g=rep(1:5, 20), x=rnorm(100),
> y=rnorm(100), key=c("t","g"))
> DT2 <- data.table( t=rep(1:10,10), x=rnorm(100), y=rnorm(100), key="t")
>
>
> Now I make a function that aggregates all values of "t" together
>
> F <- function(dt) {
> dt[, list(x=sum(x), y=sum(y)), by=setdiff(key(dt), "t")]
> }
>
>
>> F(DT1)
>     g         x         y
> [1,] 1 -1.829979 -3.320561
> [2,] 2 -4.822312  5.136586
> [3,] 3  6.326729  2.298288
> [4,] 4  4.226714  3.267511
> [5,] 5 -3.277370 -3.474824
>
>> F(DT2)
> Error in l[[1]] : subscript out of bounds
>
>
> I would like F(DT2) to work. It should be the same as calling dt[,
> list(x=sum(x), y=sum(y))].  Do you see any pathological cases where that
> wouldn't make sense as the default? As it is now, I have to check
> every time whether setdiff(key(dt), "t") is empty and, if so, make the
> dt call without the by. This is messy and invites copy-paste
> errors.
>
> Thanks!
> Chris 




