[datatable-help] BUG: droplevels mangles subsetted data.table

Prasad Chalasani pchalasani at gmail.com
Tue Feb 21 21:38:10 CET 2012


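Meanwhile, as a workaround, I suppose one could do:

# save the key columns first; in general this could be a large set of keys
keys <- key( dt )
# droplevels() behaves correctly on a plain data.frame
sub_d <- droplevels( as.data.frame( dt[ name != 'a' ] ) )
# convert back to a data.table and restore the key
sub_dt <- data.table( sub_d )
setkeyv( sub_dt, keys )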
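
An alternative that avoids the round trip through data.frame might look like the sketch below. It is only a rough illustration: drop_unused_levels is a hypothetical helper name (not a data.table function), the name column is built explicitly as a factor so there are levels to drop, and set() is used as the loop-friendly form of := rather than any official droplevels method.

```r
library(data.table)

# Hypothetical helper, not part of data.table: drop unused factor levels
# column by column, then restore the requested key.
drop_unused_levels <- function(x, keys = key(x)) {
  stopifnot(is.data.table(x))
  out <- copy(x)                      # leave the caller's table untouched
  for (col in names(out)) {
    if (is.factor(out[[col]])) {
      # droplevels() on a single factor is base R and does not involve
      # data.table indexing at all
      set(out, j = col, value = droplevels(out[[col]]))
    }
  }
  if (!is.null(keys)) setkeyv(out, keys)  # re-key, since key columns may have changed
  out
}

# Assumed example data: name is created as a factor on purpose
dt <- data.table(name = factor(c("a", "b", "c")), value = 1:3)
setkey(dt, name)

sub_dt <- drop_unused_levels(dt[name != "a"], keys = key(dt))
levels(sub_dt$name)
```

Passing keys explicitly mirrors the workaround above, where the key is read from the original table before subsetting.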



On Feb 21, 2012, at 1:59 PM, Matthew Dowle wrote:

> 
> I see the problem too but (just) adding droplevels.data.table might miss
> the root cause.
> 
>> because the way the
>> droplevels.data.frame method works isn't compatible with data.table
>> indexing.
> 
> But it's intended to be. I can see the switch at the top of [.data.table
> is detecting the caller isn't data.table aware, and it is then dispatching
> to `[.data.frame` but why it then isn't working I'm not sure. Something to
> do with the missing j or missing drop not being passed through correctly,
> perhaps.
> 
> I have heard it said (once or twice) that data.table is "almost"
> compatible with non-data.table-aware packages, but never had an example
> before. I wonder if this is it!
> 
> A (fast) droplevels.data.table using := would be good anyway, though.
> 
> Matthew
> 
> 
> 
>> Hi,
>> 
>> I see what the problem is -- we need to provide a
>> droplevels.data.table S3 method, because the way the
>> droplevels.data.frame method works isn't compatible with data.table
>> indexing.
>> 
>> Will fix:
>> 
>> https://r-forge.r-project.org/tracker/index.php?func=detail&aid=1841&group_id=240&atid=975
>> 
>> Thanks for raising the flag.
>> 
>> Cheers,
>> -steve
>> 
>> On Tue, Feb 21, 2012 at 12:38 PM, pchalasani <pchalasani at gmail.com> wrote:
>>> Surprising that this wasn't noticed before, or perhaps I'm not following
>>> some recommended idiom to drop levels when using data.table. The following
>>> code illustrates the bug clearly. The bug remains regardless of whether I
>>> use "subset" or simply use dt1 <- dt[ name != 'a' ].
>>> 
>>> 
>>> 
>>>    d <- data.table(name = c('a','b','c'), value = 1:3)
>>>    dt <- data.table(d)
>>>    setkey(dt,'name')
>>>    dt1 <- subset(dt,name != 'a')  # or dt1 <- dt[ name != 'a' ]
>>>    > dt1
>>>          name value
>>>     [1,]    b     2
>>>     [2,]    c     3
>>> 
>>>    > droplevels(dt1)
>>>          name value
>>>     [1,]    b     1
>>>     [2,]    c     3
>>> 
>>> 
>>> 
>>> --
>>> View this message in context:
>>> http://r.789695.n4.nabble.com/BUG-droplevels-mangles-subsetted-data-table-tp4407694p4407694.html
>>> Sent from the datatable-help mailing list archive at Nabble.com.
>>> _______________________________________________
>>> datatable-help mailing list
>>> datatable-help at lists.r-forge.r-project.org
>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
>> 
>> 
>> 
>> --
>> Steve Lianoglou
>> Graduate Student: Computational Systems Biology
>>  | Memorial Sloan-Kettering Cancer Center
>>  | Weill Medical College of Cornell University
>> Contact Info: http://cbio.mskcc.org/~lianos/contact
>> 
> 
> 


