[datatable-help] Summing over many variables. A new approach; a new problem

Matthew Dowle mdowle at mdowle.plus.com
Thu Jan 13 22:07:49 CET 2011


Hi Joseph,
You've found feature request #1092 'Make 'by' work for list() columns' :
https://r-forge.r-project.org/tracker/index.php?func=detail&aid=1092&group_id=240&atid=978

Notes on the FR have this though :
   Currently type 19 isn't supported in dogroups (both input and
output). This might be straightforward (with luck) to implement.
   See
http://r.789695.n4.nabble.com/Suggest-a-cool-feature-Use-data-table-like-a-sorted-indexed-data-list-tp2544213p2544213.html
   Note this is related but different to FR#202 since a list() column
*is* a vector [is.vector()=TRUE].
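
To illustrate the last note, here is a small base-R sketch (no data.table needed) of what a list() column looks like to R. It reuses the values of column A from the quoted example. The remark in the comments that "type 19" is R's internal code for a list (VECSXP) is my reading of the error message, not something stated in the FR.

```r
# Same values as column A in the quoted example
A <- list(11:15, 21:25, 31:41)

typeof(A)      # "list"; internally this is type 19 (VECSXP), as far as I know
is.vector(A)   # TRUE: a list is a generic vector
is.atomic(A)   # FALSE: unlike integer, double, logical or character vectors
length(A)      # 3, one element per row of the table
```

So the column is a perfectly good vector, just not one of the four atomic types that grouping currently accepts.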

Matthew

On Thu, 2011-01-13 at 15:17 -0500, Joseph Voelkel wrote:
> > #create matrix that includes list elements A
> 
> > (mat<-cbind(index=1:3,var=101:103,A=c(list(11:15),list(21:25),list(31:41))))
> 
>      index var A         
> 
> [1,] 1     101 Integer,5 
> 
> [2,] 2     102 Integer,5 
> 
> [3,] 3     103 Integer,11
> 
> > class(mat)
> 
> [1] "matrix"
> 
> > # convert to data frame and "fix" the first two entries
> 
> > (df<-as.data.frame(mat))
> 
>   index var                                          A
> 
> 1     1 101                         11, 12, 13, 14, 15
> 
> 2     2 102                         21, 22, 23, 24, 25
> 
> 3     3 103 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41
> 
> > class(df$index) # because mat is atomic
> 
> [1] "list"
> 
> > df$index<-as.integer(df$index) # convert to integer
> 
> > df$var<-as.integer(df$var) # likewise
> 
> > # convert to data table
> 
> > dt<-data.table(df)
> 
> > setkey(dt,index)
> 
> > 
> 
> > # try some operations
> 
> > dt[,A] # works
> 
> [[1]]
> 
> [1] 11 12 13 14 15
> 
>  
> 
> [[2]]
> 
> [1] 21 22 23 24 25
> 
>  
> 
> [[3]]
> 
> [1] 31 32 33 34 35 36 37 38 39 40 41
> 
>  
> 
> > dt[,mean(A)] # Does not work. each row of A is a list
> 
> [1] NA
> 
> Warning message:
> 
> In mean.default(A) : argument is not numeric or logical: returning NA
> 
> > dt[,mean(unlist(A))] # But here is an easy fix to make this work
> 
> [1] 27.42857
> 
> > 
> 
> > dt[,mean(var),by=index] # works (of course)
> 
>      index  V1
> 
> [1,]     1 101
> 
> [2,]     2 102
> 
> [3,]     3 103
> 
> > 
> 
> > dt[,mean(unlist(A)),by=index] # does not work! 
> 
> Error in `[.data.table`(dt, , mean(unlist(A)), by = index) : 
> 
>   only integer,double,logical and character vectors are allowed so
> far. Type 19 would need to be added.
> 
> > 
> 
> > 
> 
>  
> 
> #### Pure code ####
> 
> #create matrix that includes list elements A
> 
> (mat<-cbind(index=1:3,var=101:103,A=c(list(11:15),list(21:25),list(31:41))))
> 
> class(mat)
> 
> # convert to data frame and "fix" the first two entries
> 
> (df<-as.data.frame(mat))
> 
> class(df$index) # because mat is atomic
> 
> df$index<-as.integer(df$index) # convert to integer
> 
> df$var<-as.integer(df$var) # likewise
> 
> # convert to data table
> 
> dt<-data.table(df)
> 
> setkey(dt,index)
> 
>  
> 
> # try some operations
> 
> dt[,A] # works
> 
> dt[,mean(A)] # Does not work. each row of A is a list
> 
> dt[,mean(unlist(A))] # But here is an easy fix to make this work
> 
>  
> 
> dt[,mean(var),by=index] # works (of course)
> 
>  
> 
> dt[,mean(unlist(A)),by=index] # does not work! 
> 
>   
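
Until 'by' accepts list() columns, one possible workaround for the failing grouped call quoted above is to do the grouping in base R. This is only a sketch under that assumption, not how data.table does it internally: split() and sapply() are standard base-R functions, and the names `means` and `res` are made up for illustration.

```r
# Values taken from the quoted example
index <- 1:3
A <- list(11:15, 21:25, 31:41)

# Group the list elements by index, then average all values in each group
means <- sapply(split(A, index), function(g) mean(unlist(g)))

# Assemble a result shaped like the dt[, mean(var), by = index] output
res <- data.frame(index = as.integer(names(means)), V1 = unname(means))
res   # V1 is 13, 23 and 36 for index 1, 2 and 3
```

Because unlist() is applied per group, this also copes with an index value that occurs on more than one row.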


