[datatable-help] column of named vectors in data.table and possible bug
Arunkumar Srinivasan
aragorn168b at gmail.com
Sat Aug 24 09:57:53 CEST 2013
Dear all,
Suppose we construct a data.table in this manner:
library(data.table)
x <- c(1,1,1,2,2)
y <- 6:10
setattr(y, 'names', letters[1:5])
DT <- data.table(A = x, B = y)
DT$B
 a  b  c  d  e 
 6  7  8  9 10 
You can see that DT retains the names of vector B. But if we do:
DT[, names(B), by=A]
   A V1
1: 1  a
2: 1  b
3: 1  c
4: 2  a
5: 2  b
6: 2  c
There are two problems here. First, only the names of the first group (A = 1) are correct. Second, the group A = 2 gets the same names as the first group, and the result is recycled to that length even though the group has only two rows. Instead of 5 rows, we get 6 rows.
A way to get around it would be:
DT[, names(DT$B)[.I], by=A]
   A V1
1: 1  a
2: 1  b
3: 1  c
4: 2  d
5: 2  e
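Another way to sidestep the issue, sketched here only as an assumption about what might be acceptable, is to not rely on the names attribute at all and keep the names in an ordinary character column. The column name nm below is hypothetical and not part of the example above.

```r
library(data.table)

x <- c(1, 1, 1, 2, 2)
y <- 6:10

# Hypothetical column 'nm' holds what would otherwise be names(y).
DT <- data.table(A = x, B = y, nm = letters[1:5])

# 'nm' is subset per group like any other column, so each group
# returns exactly as many names as it has rows.
res <- DT[, list(V1 = nm), by = A]
print(res)
```

Since nm is a regular column, grouping treats it the same way as B, and the result has one row per original row.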
However, if one wants to do:
DT[, list(list(B)), by=A]$V1
[[1]]
a b c 
6 7 8 

[[2]]
 a  b 
 9 10 
You see that the names are once again wrong for A = 2 (they should be d and e). Only the first group's names remain right.
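A possible workaround for the list case, in the same spirit as the .I trick above, is to keep the names in a separate vector and attach them inside each group. This is only a sketch: the vector nms is a hypothetical helper, and B is deliberately left unnamed so that the result does not depend on how names attributes on columns are handled.

```r
library(data.table)

x <- c(1, 1, 1, 2, 2)
y <- 6:10
nms <- letters[1:5]   # hypothetical: names kept outside the column

DT <- data.table(A = x, B = y)

# .I holds the row numbers of the current group, so nms[.I] picks
# the names that belong to exactly those rows.
res <- DT[, list(V1 = list(setNames(B, nms[.I]))), by = A]$V1
print(res)
```

Each list element is then a freshly named vector built from the group's own rows.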
My question is: is it allowed to have names on column vectors? If so, then this should be a bug. If not, it'd be a great feature to have.
Arun