[datatable-help] Bug when by=key(DT)
Steve Lianoglou
mailinglist.honeypot at gmail.com
Thu Feb 24 04:05:27 CET 2011
Hi,
I'm running data.table 1.5.4 (but this also fails w/ data.table in SVN).
One of the bullet points in the news for version 1.5.3 was:
    o  'by' may now be a character vector of column names.
       This allows syntax such as DT[,sum(x),by=key(DT)].
But when the per-group summary returns fewer rows than the original
subgroup has, it fails.
For example:
R> library(data.table)
R> dt <- data.table(name=c('a', 'a', 'a', 'b', 'b', 'c', 'c', 'c'),
                    start=sample(1:50, 8))
R> dt$end <- dt$start + sample(1:50, 8)
R> key(dt) <- 'name'
This is OK:
R> dt[, list(start=max(start), end=max(end)), by='name']
     name start end
[1,]    a    47  69
[2,]    b    35  48
[3,]    c    26  52
This isn't:
R> dt[, list(start=max(start), end=max(end)), by=key(dt)]
Error in `[[<-.data.frame`(`*tmp*`, jj, value = 1:3) :
  replacement has 3 rows, data has 8
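For anyone re-running this later, a seeded version of the example might look like the sketch below. It assumes a recent data.table release, where setkey() and := are available in place of key(dt) <- and dt$end <-; the set.seed(1) call and the names by_name and by_key are additions for illustration and were not part of the session above. In such a release both forms are expected to agree, which is the behaviour the 1.5.3 news item describes.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed so the sampled values are repeatable
dt <- data.table(name = c('a', 'a', 'a', 'b', 'b', 'c', 'c', 'c'),
                 start = sample(1:50, 8))
dt[, end := start + sample(1:50, 8)]  # add the 'end' column by reference
setkey(dt, name)                      # recent-release form of key(dt) <- 'name'

# Same summary, once with a literal column name and once with key(dt).
by_name <- dt[, list(start = max(start), end = max(end)), by = 'name']
by_key  <- dt[, list(start = max(start), end = max(end)), by = key(dt)]

# One row per group is expected from both calls, with identical contents.
stopifnot(isTRUE(all.equal(as.data.frame(by_name), as.data.frame(by_key))))
print(by_key)
```

If the stopifnot() call passes silently, the two by= forms returned the same three-row summary.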
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact