[datatable-help] rbindlist and factors

Arunkumar Srinivasan aragorn168b at gmail.com
Tue May 21 20:14:35 CEST 2013


You can download 1.8.9 from R-Forge and use it. If you're wary of installing a development version, you can use devtools to install 1.8.9 in dev mode, which keeps it out of your normal library, as follows:

> require(devtools)
> dev_mode(TRUE)
d> install.packages("data.table", repos="http://R-Forge.R-project.org", type="source")
d> require(data.table)
d> # do whatever calculations you want
d> dev_mode(FALSE)
> # returns to normal session

Arun


On Tuesday, May 21, 2013 at 8:11 PM, Alexandre Sieira wrote:

> Thank you, I'll wait for the next release then.  
>  
> It's do.call("rbind", …) till then, I presume. :)  
> --  
> Alexandre Sieira
> CISA, CISSP, ISO 27001 Lead Auditor
>  
> "The truth is rarely pure and never simple."
> Oscar Wilde, The Importance of Being Earnest, 1895, Act I
>  
> On May 21, 2013 at 15:09:00, Arunkumar Srinivasan (aragorn168b at gmail.com) wrote:
>  
> > This was already addressed here:  
> > http://stackoverflow.com/questions/15933846/rbindlist-two-data-tables-where-one-has-factor-and-other-has-character-type-for
> >  
> > It is a known bug, filed here:
> > https://r-forge.r-project.org/tracker/index.php?func=detail&aid=2650&group_id=240&atid=975
> >  
> > It has been fixed in the current development version, 1.8.9 (commit 879).
> >  
> >  
> >  
> > Hope this helps,
> > Arun
> >  
> >  
> > On Tuesday, May 21, 2013 at 8:06 PM, Alexandre Sieira wrote:
> >  
> > > I think I found an unexpected behavior with rbindlist when columns are factors:
> > >  
> > > > dt1 = data.table(a=as.factor(c("a", "a", "a")))
> > >  
> > > > dt1
> > >    a
> > > 1: a
> > > 2: a
> > > 3: a
> > > > str(dt1)
> > > Classes ‘data.table’ and 'data.frame': 3 obs. of  1 variable:
> > >  $ a: Factor w/ 1 level "a": 1 1 1
> > >  - attr(*, ".internal.selfref")=<externalptr>  
> > > > dt2 = data.table(a=as.factor(c("b", "b", "b")))
> > > > dt2
> > >    a
> > > 1: b
> > > 2: b
> > > 3: b
> > > > str(dt2)
> > > Classes ‘data.table’ and 'data.frame': 3 obs. of  1 variable:
> > >  $ a: Factor w/ 1 level "b": 1 1 1
> > >  - attr(*, ".internal.selfref")=<externalptr>  
> > >  
> > > If I rbind them, I get the expected value - a table with 6 rows, 3 of which have value "a" and 3 with value "b":
> > >  
> > > > rbind(dt1, dt2)
> > >    a
> > > 1: a
> > > 2: a
> > > 3: a
> > > 4: b
> > > 5: b
> > > 6: b
> > >  
> > >  
> > > So if I do rbindlist(list(dt1, dt2)), I would expect to get the exact same result, only faster. Unfortunately, that is not the case:
> > >  
> > > > rbindlist(list(dt1, dt2))
> > >    a
> > > 1: a
> > > 2: a
> > > 3: a
> > > 4: a
> > > 5: a
> > > 6: a
> > >  
> > > > str(rbindlist(list(dt1, dt2)))
> > > Classes ‘data.table’ and 'data.frame': 6 obs. of  1 variable:
> > >  $ a: Factor w/ 1 level "a": 1 1 1 1 1 1
> > >  - attr(*, ".internal.selfref")=<externalptr>  
> > >  
> > >  
> > > This was executed with R 3.0.1 and data.table 1.8.8 on Mac OS X 10.8.3.
> > >  
> > > Is this expected behavior? Am I missing something?
> > >  
> > >  
> > >  
> > > --  
> > > Alexandre Sieira
> > > CISA, CISSP, ISO 27001 Lead Auditor
> > >  
> > > "The truth is rarely pure and never simple."
> > > Oscar Wilde, The Importance of Being Earnest, 1895, Act I
> > >  
> > >  
> > >  
> >  
> >  
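
For anyone staying on 1.8.8 until the next release, the interim workaround mentioned in the thread can be sketched roughly as below. This is only a minimal illustration, not something posted by the participants: `factors_to_char` is a hypothetical helper name, and converting factor columns to character before calling rbindlist is an assumption about what sidesteps the bug, not a documented fix.

```r
library(data.table)

# Same tables as in the original report
dt1 <- data.table(a = as.factor(c("a", "a", "a")))
dt2 <- data.table(a = as.factor(c("b", "b", "b")))

# Workaround 1: fall back to rbind via do.call, as suggested in the thread
res1 <- do.call("rbind", list(dt1, dt2))

# Workaround 2 (assumption): turn factor columns into character columns
# first, so rbindlist never has to merge factor levels.
# factors_to_char is a hypothetical helper, not part of data.table.
factors_to_char <- function(dt) {
  as.data.table(lapply(dt, function(x) if (is.factor(x)) as.character(x) else x))
}
res2 <- rbindlist(lapply(list(dt1, dt2), factors_to_char))

# Both results should hold three "a" rows followed by three "b" rows
print(as.character(res1$a))
print(res2$a)
```

The second variant returns a character column rather than a factor, so wrap it in `as.factor()` afterwards if the factor type matters downstream.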
