[datatable-help] Should by=character(0) just perform as if no by was called?
Matthew Dowle
mdowle at mdowle.plus.com
Wed Sep 21 15:01:00 CEST 2011
Yes, agreed. Please add as a bug report.
> DT = data.table(a=1:2,b=1:6)
> DT[,sum(b),by=NULL]
Error in bysubl[[1]] : subscript out of bounds
should be the same as:
> DT[,sum(b)]
[1] 21
Thanks,
Matthew
"Chris Neff" <caneff at gmail.com> wrote in message
news:CAAuY0RUt2y0S87N5NOyeV_syJ3-5pYZni3cXd6UpeEhchasmLA at mail.gmail.com...
>In my code I sometimes have multiple possible inputs to a function,
> some of them with more key columns than others. I then often
> create aggregations over some of those key columns. However, this
> leads to cases where I try to aggregate over no columns at all.
> An example:
>
> Suppose I have two data tables, one with "t" and "g" as keys and
> another with just "t" as a key.
>
> DT1 <- data.table( t=rep(1:10,10), g=rep(1:5, 20), x=rnorm(100),
> y=rnorm(100), key=c("t","g"))
> DT2 <- data.table( t=rep(1:10,10), x=rnorm(100), y=rnorm(100), key="t")
>
>
> Now I make a function that aggregates all values of "t" together
>
> F <- function(dt) {
> dt[, list(x=sum(x), y=sum(y)), by=setdiff(key(dt), "t")]
> }
>
>
>> F(DT1)
> g x y
> [1,] 1 -1.829979 -3.320561
> [2,] 2 -4.822312 5.136586
> [3,] 3 6.326729 2.298288
> [4,] 4 4.226714 3.267511
> [5,] 5 -3.277370 -3.474824
>
>> F(DT2)
> Error in l[[1]] : subscript out of bounds
>
>
> I would like F(DT2) to work. It should be the same as calling dt[,
> list(x=sum(x), y=sum(y))]. Do you see any pathological cases where that
> wouldn't make sense as the default? As it is now, I have to check
> every time whether setdiff(key(dt), "t") is empty, and if so make a dt
> call without the by. This is messy and invites copy-paste errors.
>
> Thanks!
> Chris
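
The workaround Chris describes (check whether any grouping columns remain, and drop the by when none do) can be sketched roughly as below. The helper name agg_over_t and the variable by_cols are hypothetical, introduced only for illustration; this is a minimal sketch of the manual check, not how data.table itself handles an empty by.

```r
library(data.table)

# Hypothetical helper (not part of data.table): sum x and y over "t",
# grouping by whatever key columns remain once "t" is removed.
agg_over_t <- function(dt) {
  by_cols <- setdiff(key(dt), "t")
  if (length(by_cols) == 0L) {
    # No grouping columns left: aggregate over all rows without a by
    dt[, list(x = sum(x), y = sum(y))]
  } else {
    dt[, list(x = sum(x), y = sum(y)), by = by_cols]
  }
}

DT1 <- data.table(t = rep(1:10, 10), g = rep(1:5, 20),
                  x = rnorm(100), y = rnorm(100), key = c("t", "g"))
DT2 <- data.table(t = rep(1:10, 10), x = rnorm(100), y = rnorm(100), key = "t")

res1 <- agg_over_t(DT1)  # one row per value of g
res2 <- agg_over_t(DT2)  # a single row of overall totals
```

Keeping the check inside one helper at least confines the branching to a single place, so callers do not need to repeat it at every aggregation.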