[datatable-help] column of named vectors in data.table and possible bug

Arunkumar Srinivasan aragorn168b at gmail.com
Fri Sep 6 11:52:48 CEST 2013


Hi Thell, 

It's not late :). Thanks for your reply. Yes, of course we could do it the way you suggested. But the use case I have in mind for this feature is quite different: I was thinking of something even more efficient for this question on SO (http://stackoverflow.com/questions/17308551/do-callrbind-list-for-uneven-number-of-column).
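
A minimal sketch of the per-group named vectors in question, assuming the names are carried in an ordinary column (a hypothetical `nm`, not something from the thread) instead of as an attribute of B. Base R's split() is shown only as a reference for the names one would expect in each group; this is an illustration of the workaround, not the approach intended for the SO question:

```r
library(data.table)

x <- c(1, 1, 1, 2, 2)
y <- 6:10

# Keep the names in a regular column (hypothetical `nm`) so that they are
# subset per group together with B.
DT <- data.table(A = x, B = y, nm = letters[1:5])

# Rebuild one named vector per group from the two plain columns.
res <- DT[, list(B = list(setNames(B, nm))), by = A]

# Reference: base R's split() keeps the names of a named vector per group.
expected <- split(setNames(y, letters[1:5]), x)

res$B
expected
```

With this layout the second group should carry its own names (d, e) rather than those of the first group, at the cost of an extra character column.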

Arun


On Thursday, September 5, 2013 at 7:41 PM, Thell Fowler wrote:

> Perhaps a 'too late' reply, but have you thought about bringing the names into the DT, using them, then dropping them?
> 
> For example:
> 
> > DT[, n:=names(DT$B)]
> > DT[,list(B=list(B),Names=list(n)),by=A]
>    A     B Names
> 1: 1 6,7,8 a,b,c
> 2: 2  9,10   d,e
> > DT$n<-NULL
> 
> 
> 
> On Sat, Aug 24, 2013 at 2:57 AM, Arunkumar Srinivasan <aragorn168b at gmail.com> wrote:
> > Dear all, 
> > 
> > Suppose we construct a data.table in this manner:
> > 
> > library(data.table)
> > x <- c(1,1,1,2,2)
> > y <- 6:10
> > setattr(y, 'names', letters[1:5])  # name the vector by reference
> > DT <- data.table(A = x, B = y)
> > 
> > DT$B
> >  a  b  c  d  e 
> >  6  7  8  9 10 
> > 
> > 
> > You see that DT maintains the names of vector B. But if we do:
> > 
> > DT[, names(B), by=A] 
> >    A V1
> > 1: 1  a
> > 2: 1  b
> > 3: 1  c
> > 4: 2  a
> > 5: 2  b
> > 6: 2  c
> > 
> > 
> > There are two problems here. First, only the names of the first group (A = 1) are correct. Second, the second group gets the same names as the first, and the result is recycled to fit that length, so we get 6 rows instead of 5.
> > 
> > A way to get around it would be:
> > 
> > DT[, names(DT$B)[.I], by=A]
> >    A V1
> > 1: 1  a
> > 2: 1  b
> > 3: 1  c
> > 4: 2  d
> > 5: 2  e
> > 
> > 
> > However, if one wants to do:
> > 
> > DT[, list(list(B)), by=A]$V1
> > [[1]]
> > a b c 
> > 6 7 8 
> > 
> > [[2]]
> >  a  b 
> >  9 10 
> > 
> > 
> > You see that the names are once again wrong (for A = 2). Only the names of the first group are right.
> > 
> > My question is: are names on column vectors supported usage? If so, then this should be a bug. If not, it'd be a great feature to have.
> > 
> > Arun
> > 
> > 
> > _______________________________________________
> > datatable-help mailing list
> > datatable-help at lists.r-forge.r-project.org
> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
> 
> 
> 
> -- 
> Sincerely,
> Thell 
