[datatable-help] 1.7.0 submitted to CRAN

Matthew Dowle mdowle at mdowle.plus.com
Wed Oct 19 22:59:58 CEST 2011


NEW FEATURES

o   data.table() now accepts list columns directly rather than
    needing to add list columns to an existing data.table; e.g.,
        
        DT = data.table(x=1:3,y=list(4:6,3.14,matrix(1:12,3)))
            
    Thanks to Branson Owen for the reminder. As before, list columns
    can be created via grouping; e.g.,
        
        DT = data.table(x=c(1,1,2,2,2,3,3),y=1:7)
        DT2 = DT[,list(list(unique(y))),by=x]
        DT2
             x      V1
        [1,] 1    1, 2
        [2,] 2 3, 4, 5
        [3,] 3    6, 7
            
    and list columns can be grouped; e.g.,
        
        DT2[,sum(unlist(V1)),by=list(x%%2)]
             x V1
        [1,] 1 16
        [2,] 0 12
            
    Accordingly, one item has been added to FAQ 2.17 (differences
    between data.frame and data.table): data.frame(list(1:2,"k",1:4))
    creates 3 columns, data.table creates one list column.

o   subset, transform and within now retain keys when the expression
    does not 'touch' key columns, implementing FR #1341.
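
    A minimal sketch of the key-retention behaviour described above.
    The table DT and its columns a and b are hypothetical names chosen
    for illustration; the results are not printed here, so run it
    against your installed version to confirm.

```r
library(data.table)

# Hypothetical example table, keyed on column a.
DT = data.table(a = c("x", "y", "z"), b = 1:3, key = "a")

# transform() changes only b, so the key on a is expected to survive.
DT2 = transform(DT, b = b * 2L)
key(DT2)

# subset() filters on the non-key column b; the key should also remain.
DT3 = subset(DT, b > 1L)
key(DT3)
```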

o   Recycling list() items on RHS of := now works; e.g.,
    
        DT[,1:4:=list(1L,NULL),with=FALSE]
        # set columns 1 and 3 to 1L and remove columns 2 and 4
            
o   Factor columns on LHS of :=, [<- and $<- can now be assigned
    new levels; e.g.,
        
        DT = data.table(A=c("a","b"))
        DT[2,"A"] <- "c"  # adds new level automatically
        DT[2,A:="c"]      # same (faster)
        DT$A = "newlevel" # adds new level and recycles it
            
    Thanks to Damian Betebenner and Chris Neff for highlighting.
    To change the type of a column, provide a full length RHS (i.e.
    'replace' the column).        

BUG FIXES

o   := with i all FALSE no longer sets the whole column, fixing
    bug #1570. Thanks to Chris Neff for reporting.
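
    A small sketch of the fixed case, using a hypothetical table: the
    i expression below matches no rows, so y should be left untouched.

```r
library(data.table)

# Hypothetical example table.
DT = data.table(x = 1:3, y = 4:6)

# x > 10L is FALSE for every row, so no element of y is assigned.
DT[x > 10L, y := 0L]
DT$y
```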
        
o   A zero-length by (such as NULL or character(0)) now behaves as
    if by is missing, fixing bug #1599. This is useful when by
    is dynamic and "don't group" needs to be represented.
    Thanks to Chris Neff for reporting.
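
    A sketch of the two zero-length forms named above, on a
    hypothetical table. j is wrapped in list() with an illustrative
    column name, total, so both calls return a table; each is expected
    to give the ungrouped sum.

```r
library(data.table)

# Hypothetical example table.
DT = data.table(x = c("a", "a", "b"), y = 1:3)

# Both zero-length forms of by should act as if by were not supplied.
res_null  = DT[, list(total = sum(y)), by = NULL]
res_empty = DT[, list(total = sum(y)), by = character(0)]

# For comparison: the same j grouped by x.
res_x = DT[, list(total = sum(y)), by = x]
```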
        
o   NULL j no longer results in 'inconsistent types' error, but
    instead returns no rows for that group, fixing bug #1576.
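
    One way this can be used is to drop whole groups by letting j
    evaluate to NULL for them. A sketch on a hypothetical table, where
    the threshold 5L and the column name total are illustrative only:

```r
library(data.table)

# Hypothetical example table: group "a" sums to 11, group "b" to 1.
DT = data.table(x = c("a", "a", "b"), y = c(5L, 6L, 1L))

# if() without else returns NULL when the condition is FALSE, so
# group "b" contributes no rows to the result.
res = DT[, if (sum(y) > 5L) list(total = sum(y)), by = x]
res
```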
        
o   A matrix i is now an error; previously i was used as if it were
    a vector, giving incorrect results. It was never documented that
    a matrix might be an acceptable type. A matrix i is
    still acceptable in [<-; e.g.,
        DT[is.na(DT)] <- 1L
    and this now works rather than assigning to non-NA items in some
    cases.
        
o   Inconsistent [<- behaviour is now fixed (#1593) so these examples
    now work:
        DT[x == "a", ]$y <- 0L
        DT["a", ]$y <- 0L
    But, := is highly encouraged instead for speed; i.e.,
        DT[x == "a", y:=0L]
        DT["a", y:=0L]
    Thanks to Leon Baum for reporting.
        
o   unique on an unsorted table now works, fixing bug #1601.
    Thanks to a question by Iterator on Stack Overflow.
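
    A sketch of unique() on an unkeyed, unsorted table. The data is
    hypothetical; the duplicated rows are deliberately not adjacent.

```r
library(data.table)

# Hypothetical unsorted table with non-adjacent duplicate rows.
DT = data.table(x = c(2L, 1L, 2L, 1L), y = c("b", "a", "b", "a"))

# Each distinct row should be kept once.
U = unique(DT)
U
```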
        
o   Bug fix #1534 in v1.6.5 (see NEWS below) only worked if data.table
    was higher than IRanges on the search() path, despite the item in
    NEWS stating otherwise. Fixed.
        
o   Compatibility with package sqldf (which can call
    do.call("rbind",...) on an empty "...") is fixed and a test added.
    data.table was switching on list(...)[[1]] rather than ..1.
    Thanks to RYogi for reporting #1623.
        
USER VISIBLE CHANGES

o   cbind and rbind are no longer masked. But, please do read FAQ 2.23,
    4.4 and 5.1.




