[datatable-help] rbindlist on list of data.frames with factor column
Ricardo Saporta
saporta at scarletmail.rutgers.edu
Thu Mar 28 17:52:57 CET 2013
Hello,
I found that when using `rbindlist` on a list of data.frames with factor
columns, the factor column is concatenated as its numeric equivalent
(the underlying integer codes) rather than as its labels.
This, of course, does not happen when using a list of data.tables.
library(data.table)

# sample data, using data.frame
# (stringsAsFactors is set explicitly; it is no longer the default as of R 4.0.0)
sampleList.DF <- lapply(LETTERS[1:5], function(L)
    data.frame(Val1=rnorm(3), Val2=runif(3), FactorCol=L, stringsAsFactors=TRUE) )
sampleList.DF <- lapply(sampleList.DF, function(x)
    {x$StringCol <- as.character(x$FactorCol); x})

# sample data, using data.table
sampleList.DT <- lapply(LETTERS[1:5], function(L)
    data.table(Val1=rnorm(3), Val2=runif(3), FactorCol=L) )
sampleList.DT <- lapply(sampleList.DT, function(x)
    x[, StringCol := as.character(FactorCol)])

# Compare the column `FactorCol`:
rbindlist(sampleList.DT)
rbindlist(sampleList.DF)
do.call(rbind, sampleList.DF)
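For readers unsure what "numeric equivalent" means here, the following base-R sketch may help. It only illustrates how a factor stores integer codes alongside its labels; it is not a reproduction of what `rbindlist` does internally, and the names `f1`, `f2`, `codes` and `labels` are made up for the illustration.

```r
# Illustrative values only; f1 and f2 are not from the original post.
f1 <- factor("A")
f2 <- factor("B")

# Each factor has a single level, so its only integer code is 1.
codes <- c(as.integer(f1), as.integer(f2))

# Converting to character first keeps the labels.
labels <- c(as.character(f1), as.character(f2))

print(codes)   # 1 1
print(labels)  # "A" "B"
```

If the codes are concatenated without reconciling the levels, as in `codes`, the distinction between "A" and "B" is lost.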
Interestingly, I originally thought the behavior depended on the levels.
For example, I would have expected the following to allow for the levels
of the third list element, but it does not:
sampleList.DF[[1]][, "FactorCol"] <- factor(c("A", "C", "A"))
# all the levels in third element are present in the first
all(levels(sampleList.DF[[3]][, "FactorCol"]) %in%
    levels(sampleList.DF[[1]][, "FactorCol"]))
# [1] TRUE
But...
rbindlist(sampleList.DF)
However:
sampleList.DF[[1]][, "FactorCol"] <- factor(c("C", "A", "A"),
                                            levels=c("C", "A"))
rbindlist(sampleList.DF)
Is the above behavior intended?
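Until that is settled, one possible workaround is to convert factor columns to character before binding. The sketch below uses base R only. `factors_to_character` is a hypothetical helper written for this illustration, not a data.table function, and `parts` is made-up data.

```r
# Hypothetical helper (not part of data.table): convert every factor
# column of a data.frame to character, leaving other columns untouched.
factors_to_character <- function(df) {
  is_fac <- vapply(df, is.factor, logical(1))
  if (any(is_fac)) {
    df[is_fac] <- lapply(df[is_fac], as.character)
  }
  df
}

# Small illustrative inputs, not the data from the post.
parts <- list(
  data.frame(x = 1:2, f = factor(c("A", "C"))),
  data.frame(x = 3:4, f = factor(c("B", "B")))
)
converted <- lapply(parts, factors_to_character)

# Inspect the column classes of the first converted element.
str(converted[[1]])
```

With the sample data above, the converted list could then be passed on as `rbindlist(lapply(sampleList.DF, factors_to_character))`. Whether that gives the desired result should be checked against the data.table version in use.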
Cheers,
Rick