[datatable-help] column of named vectors in data.table and possible bug

Arunkumar Srinivasan aragorn168b at gmail.com
Sat Aug 24 09:57:53 CEST 2013


Dear all, 

Suppose we've constructed a data.table in this manner:

library(data.table)

x <- c(1, 1, 1, 2, 2)
y <- 6:10
setattr(y, 'names', letters[1:5])  # name the elements of y by reference
DT <- data.table(A = x, B = y)

DT$B
 a  b  c  d  e 
 6  7  8  9 10 


You see that DT keeps the names of vector B. But if we do:

DT[, names(B), by=A]
   A V1
1: 1  a
2: 1  b
3: 1  c
4: 2  a
5: 2  b
6: 2  c


There are two problems here. First, only the names of the first group (A = 1) are correct. Second, the second group (A = 2) gets the same names as the first group, and its result is padded to the first group's length of three even though it has only two rows. So instead of 5 rows, we get 6.

A way to get around it would be:

DT[, names(DT$B)[.I], by=A]
   A V1
1: 1  a
2: 1  b
3: 1  c
4: 2  d
5: 2  e


However, if one wants to do:

DT[, list(list(B)), by=A]$V1
[[1]]
a b c 
6 7 8 

[[2]]
 a  b 
 9 10 


You see that the names are once again wrong for A = 2 (they should be d and e). Only the first group's names are right.
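
Another possible way around this, sketched here under the assumption that it is acceptable to keep the names in a separate column rather than as an attribute of B (the column name `nm` is made up for illustration), is to attach the names inside each group:

```r
library(data.table)

x <- c(1, 1, 1, 2, 2)
y <- 6:10

# Assumption: the labels live in their own column ("nm" is a hypothetical
# name) instead of being stored as a names attribute on B.
DT <- data.table(A = x, B = y, nm = letters[1:5])

# Within each group, B and nm are subset together, so setNames() pairs
# each group's values with that group's own labels.
res <- DT[, list(list(setNames(B, nm))), by = A]$V1
print(res)
```

This avoids relying on the names attribute of a column at all, so it does not answer whether named column vectors are meant to be supported.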

My question is: is it allowed to have names on column vectors? If so, then this should be a bug. If not, it'd be a great feature to have.

Arun
