[datatable-help] 1.7.0 submitted to CRAN
Matthew Dowle
mdowle at mdowle.plus.com
Wed Oct 19 22:59:58 CEST 2011
NEW FEATURES
o data.table() now accepts list columns directly rather than
needing to add list columns to an existing data.table; e.g.,
DT = data.table(x=1:3,y=list(4:6,3.14,matrix(1:12,3)))
Thanks to Branson Owen for the reminder. As before, list columns
can be created via grouping; e.g.,
DT = data.table(x=c(1,1,2,2,2,3,3),y=1:7)
DT2 = DT[,list(list(unique(y))),by=x]
DT2
     x      V1
[1,] 1    1, 2
[2,] 2 3, 4, 5
[3,] 3    6, 7
and list columns can be grouped; e.g.,
DT2[,sum(unlist(V1)),by=list(x%%2)]
     x V1
[1,] 1 16
[2,] 0 12
Accordingly, one item has been added to FAQ 2.17 (differences
between data.frame and data.table): data.frame(list(1:2,"k",1:4))
creates 3 columns, data.table creates one list column.
o subset, transform and within now retain keys when the expression
does not 'touch' key columns, implementing FR #1341.
o Recycling list() items on RHS of := now works; e.g.,
DT[,1:4:=list(1L,NULL),with=FALSE]
# set columns 1 and 3 to 1L and remove columns 2 and 4
o Factor columns on LHS of :=, [<- and $<- can now be assigned
new levels; e.g.,
DT = data.table(A=c("a","b"))
DT[2,"A"] <- "c" # adds new level automatically
DT[2,A:="c"] # same (faster)
DT$A = "newlevel" # adds new level and recycles it
Thanks to Damian Betebenner and Chris Neff for highlighting.
To change the type of a column, provide a full length RHS (i.e.
'replace' the column).
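
The key-retention item above (FR #1341) can be illustrated with a minimal
sketch. The table and column names here (id, v, w) are hypothetical, and the
snippet assumes a data.table version at or after 1.7.0; it has not been
checked against 1.7.0 itself.

```r
library(data.table)

# Hypothetical example table, keyed by id
DT = data.table(id = c("b", "a", "c"), v = 1:3, key = "id")

# The new column w is derived from the non-key column v only,
# so the expression does not touch the key column id
DT2 = transform(DT, w = v * 2L)

key(DT2)    # expected to still report the key of DT
```

An expression that modifies id itself would be expected to drop the key,
since the sort order could no longer be guaranteed.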
BUG FIXES
o := with i all FALSE no longer sets the whole column, fixing
bug #1570. Thanks to Chris Neff for reporting.
o A 0-length by (such as NULL or character(0)) now behaves as
if by is missing, fixing bug #1599. This is useful when by
is dynamic and "don't group" needs to be represented.
Thanks to Chris Neff for reporting.
o NULL j no longer results in 'inconsistent types' error, but
instead returns no rows for that group, fixing bug #1576.
o matrix i is now an error rather than using i as if it were a
vector and obtaining incorrect results. matrix was never documented
as an acceptable type for i. matrix i is
still acceptable in [<-; e.g.,
DT[is.na(DT)] <- 1L
and this now works rather than assigning to non-NA items in some
cases.
o Inconsistent [<- behaviour is now fixed (#1593) so these examples
now work:
DT[x == "a", ]$y <- 0L
DT["a", ]$y <- 0L
However, := is strongly encouraged instead, for speed; i.e.,
DT[x == "a", y:=0L]
DT["a", y:=0L]
Thanks to Leon Baum for reporting.
o unique on an unsorted table now works, fixing bug #1601.
Thanks to a question by Iterator on Stack Overflow.
o Bug fix #1534 in v1.6.5 (see NEWS below) only worked if data.table
was higher than IRanges on the search() path, despite the item in
NEWS stating otherwise. Fixed.
o Compatibility with package sqldf (which can call do.call
("rbind",...) on an empty "...") is fixed and a test added. data.table
was switching on list(...)[[1]] rather than ..1. Thanks to RYogi for
reporting #1623.
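
Three of the fixes above (#1570, #1599 and #1601) come without examples, so
here is a minimal sketch of the behaviour they describe. The table and the
variable names (unchanged, same_null, same_chr0, ux) are hypothetical, and
the snippet assumes a data.table version at or after 1.7.0 rather than
having been run on 1.7.0 itself.

```r
library(data.table)

# Hypothetical table, deliberately not sorted by x
DT = data.table(x = c(3L, 1L, 3L, 2L, 1L), y = 1:5)

# #1570: i is all FALSE, so := should leave column y unchanged
DT[x > 10L, y := 0L]
unchanged = identical(DT$y, 1:5)

# #1599: a 0-length by should behave as if by were missing
same_null = identical(DT[, sum(y), by = NULL], DT[, sum(y)])
same_chr0 = identical(DT[, sum(y), by = character(0)], DT[, sum(y)])

# #1601: unique() on an unsorted table should keep one row per value of x
ux = unique(DT[, list(x)])$x
```

If the fixes behave as described, unchanged, same_null and same_chr0 are all
TRUE, and ux holds each value of x exactly once.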
USER VISIBLE CHANGES
o cbind and rbind are no longer masked. But, please do read FAQ 2.23,
4.4 and 5.1.