[datatable-help] "ungrouping" a data.table
statquant3
statquant at outlook.com
Fri Jan 8 14:34:37 CET 2016
I am trying to replicate the kdb ungroup function.
Say you have a table with several list columns where, within each row,
every list has the same number of elements:
R) library(data.table)
R) t = Sys.time()
R) DT=data.table(a=c(1,2,3),b=c('q','w','e'),c=list(rep(t,2),rep(t+1,3),rep(t,0)),d=list(rep(1,2),rep(20,3),rep(1,0)))
R) DT
   a b                                                                             c        d
1: 1 q                           2016-01-08 13:45:04.16544,2016-01-08 13:45:04.16544      1,1
2: 2 w 2016-01-08 13:45:05.16544,2016-01-08 13:45:05.16544,2016-01-08 13:45:05.16544 20,20,20
3: 3 e
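The "same number of elements on the same row" condition can be checked before ungrouping. A minimal sketch in base R follows; `col_c` and `col_d` are hypothetical stand-ins for the list columns c and d, built as plain lists with a fixed timestamp so the snippet runs without data.table.

```r
# Hypothetical stand-ins for the list columns c and d of the table above
t0 <- as.POSIXct("2016-01-08 13:45:04", tz = "UTC")
col_c <- list(rep(t0, 2), rep(t0 + 1, 3), rep(t0, 0))
col_d <- list(rep(1, 2), rep(20, 3), rep(1, 0))

# lengths() gives the number of elements held in each cell of a list
n_c <- lengths(col_c)
n_d <- lengths(col_d)

# TRUE when every row has matching lengths across the two list columns
same_shape <- identical(n_c, n_d)
```

If `same_shape` is FALSE, the unlisted columns would end up with different lengths and could not be recombined row by row.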
The idea is to unlist all list columns, keeping the non-list columns
unchanged (repeated once per list element).
I have the following:
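dtUngroup <- function(DT){
  # names of the list columns (names, not positions, so setdiff works below)
  listCols <- names(DT)[vapply(DT, is.list, logical(1L))]
  if(length(listCols) > 0L){
    nonListCols <- setdiff(names(DT), listCols)
    # elements per row, taken from the first list column;
    # kept as a local vector so the input DT is not modified by reference
    nbListElem <- lengths(DT[[listCols[1L]]])
    # repeat each non-list value once per list element
    DT1 <- DT[, lapply(.SD, FUN=rep, times=nbListElem), .SDcols=nonListCols]
    # flatten the list columns (unlist() drops attributes such as the POSIXct class)
    DT1[, (listCols) := DT[, lapply(.SD, FUN=unlist), .SDcols=listCols]]
    return(DT1)
  }
  return(DT)
}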
R) dtUngroup(DT)[]
a b c d
1: 1 q 1452260946 1
2: 1 q 1452260946 1
3: 2 w 1452260947 20
4: 2 w 1452260947 20
5: 2 w 1452260947 20
But as you can see:
1. it is verbose
2. empty lists are unsupported (row 3 is dropped)
3. the POSIXct type is downcast to numeric
Any ideas on how to fix these?
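
Points 2 and 3 can be reproduced outside data.table. The following is a minimal sketch in base R, not a definitive fix: the sample values and the names `cells`, `pad_empty`, `flat_c` and `flat_padded` are made up for illustration, and turning an empty cell into a single NA row is an assumption about the desired result (dropping such rows may be equally valid).

```r
# Hypothetical list column of timestamps, shaped like column c above
t0 <- as.POSIXct("2016-01-08 13:45:04", tz = "UTC")
cells <- list(rep(t0, 2), rep(t0 + 1, 3), rep(t0, 0))

# Point 3: unlist() drops attributes, so the POSIXct class is lost
flat_unlist <- unlist(cells)
class(flat_unlist)

# c() dispatches on its first argument, so c.POSIXct keeps the class
flat_c <- do.call(c, cells)
class(flat_c)

# Point 2: a zero-length cell contributes no rows at all
a <- c(1, 2, 3)
rep(a, times = lengths(cells))

# Assumption: an empty cell should survive as one NA row.
# Indexing with NA_integer_ yields a single NA of the same class.
pad_empty <- function(x) if (length(x) == 0L) x[NA_integer_] else x
padded <- lapply(cells, pad_empty)
rep(a, times = lengths(padded))
flat_padded <- do.call(c, padded)
```

Here `do.call(c, ...)` is swapped in for `unlist()` only because `c()` has a POSIXct method; for list columns of plain numerics or characters the two give the same values.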