[datatable-help] List-valued column

Matthew Dowle mdowle at mdowle.plus.com
Wed Oct 26 23:04:38 CEST 2011


There was an item with examples in NEWS for 1.7.0 (pasted below). You
don't need the I(). However, there seems to be a problem when the first
column is a list column. This probably slipped through because the first
few columns are usually the key, and list columns can't be key columns.
Thanks for letting us know. Will fix and add a test ...
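
For the example in the original post, here is a minimal sketch of the
form without I(). It assumes 1.7.0 or later, and it puts an ordinary
column first to stay clear of the first-column problem; the column name
id is made up purely for illustration.

```r
library(data.table)

# Hypothetical 'id' column goes first so that the list column is not the
# first column; 'a' holds the variable-length vectors from the original post.
dt = data.table(id=1:2, a=list(1:2,5:7))

# One row per list item; each item keeps its own length.
lens = sapply(dt$a, length)
lens
```

Each item of a stays a separate vector, so lens should report a different
length for each row.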

List columns and := are quite recent and still need full documentation,
so in the meantime NEWS needs to be followed closely.

There is a FAQ (2.17) documenting the differences between data.frame and
data.table. An entry on how list() is handled differently has been added
there.

Here's the NEWS item from 1.7.0:

o   data.table() now accepts list columns directly rather than
    needing to add list columns to an existing data.table; e.g.,
        
        DT = data.table(x=1:3,y=list(4:6,3.14,matrix(1:12,3)))
            
    Thanks to Branson Owen for reminding. As before, list columns
    can be created via grouping; e.g.,
        
        DT = data.table(x=c(1,1,2,2,2,3,3),y=1:7)
        DT2 = DT[,list(list(unique(y))),by=x]
        DT2
             x      V1
        [1,] 1    1, 2
        [2,] 2 3, 4, 5
        [3,] 3    6, 7
            
    and list columns can be grouped; e.g.,
      
        DT2[,sum(unlist(V1)),by=list(x%%2)]
             x V1
        [1,] 1 16
        [2,] 0 12
            
    Accordingly, one item has been added to FAQ 2.17 (differences
    between data.frame and data.table): data.frame(list(1:2,"k",1:4))
    creates 3 columns, data.table creates one list column.
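
Going the other way, a list column built by grouping can be flattened
back to one row per value. This is only a sketch along the lines of the
NEWS examples above, not something taken from NEWS itself; DT3 is a name
introduced here for illustration.

```r
library(data.table)

# Same data as in the NEWS example.
DT = data.table(x=c(1,1,2,2,2,3,3), y=1:7)

# Collapse y into a list column, one list item per group of x.
DT2 = DT[, list(list(unique(y))), by=x]

# Unpack again: unlist() each group's item so every value gets its own row.
DT3 = DT2[, list(y=unlist(V1)), by=x]
DT3
```

Since y has no duplicates within a group here, DT3 should have the same
seven rows as DT.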


The last paragraph of that NEWS item isn't quite correct where the first
column is concerned:

> data.table(a=list(1:2,"k",1:4))  # wrong
     V1 k V3
[1,]  1 k  1
[2,]  2 k  2
[3,]  1 k  3
[4,]  2 k  4
> data.table(a=1:3,b=list(1:2,"k",1:4))  # correct
     a          b
[1,] 1       1, 2
[2,] 2          k
[3,] 3 1, 2, 3, 4
> 
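
To tell the two outcomes apart without relying on the printed output, a
quick check along these lines may help. It is a sketch only, using the
form marked correct above; the variable names b_is_list, b_lengths and
third are introduced here for illustration.

```r
library(data.table)

# The form that works: atomic column first, list column second.
DT = data.table(a=1:3, b=list(1:2,"k",1:4))

b_is_list = is.list(DT$b)         # TRUE when b is stored as a list column
b_lengths = sapply(DT$b, length)  # length of each item in b
third     = DT$b[[3]]             # pull out a single item, here 1:4

ncol(DT)   # 2 if b stayed one column rather than being split up
```

If the list had been split into separate columns instead, ncol(DT) would
be larger and DT$b would not exist.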

Thanks,
Matthew


On Wed, 2011-10-26 at 12:04 -0700, macrakis wrote:
> I'm appreciating the functionality and speed of data.table, but I can't
> figure out how to manipulate list-valued columns:
> 
> Compare:
> 
> > df <- data.frame(a=I(list(1:2,5:7))); df
>         a
> 1    1, 2
> 2 5, 6, 7
> > dt <- data.table(a=I(list(1:2,5:7))); dt
> Error in data.table(a = I(list(1:2, 5:7))) : 
>   arguments cannot be silently repeated to match max nr: 2, 3
> 
> How do I go about putting variable-length objects into a column of a data
> table?  The documentation says that the "..." in data.table(...) is just as
> in data.frame.
> 
> The documentation also says that POSIXlt is not supported "because it uses
> 40 bytes to store a single datetime".  Apparently the limitation is not just
> on POSIXlt, but on other non-atomic vector types as well?  
> 
> Thanks,
> 
>         -s
> 



