[datatable-help] BUG: droplevels mangles subsetted data.table
Prasad Chalasani
pchalasani at gmail.com
Tue Feb 21 21:38:10 CET 2012
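Meanwhile, as a workaround, I suppose one could do:
keys <- key( dt ) # save the key columns; in general this could be a large set of keys
sub_d <- droplevels( as.data.frame( dt[ name != 'a' ] ) ) # drop levels on a plain data.frame
sub_dt <- data.table( sub_d ) # convert back to a data.table
setkeyv( sub_dt, keys ) # restore the original key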
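
A variant that avoids the round trip through data.frame might look like the sketch below. Note that drop_levels_dt is a hypothetical helper name, not a data.table function, and the sketch assumes a data.table version that provides set() and copy(). It also assumes the name column is a factor, as in the example further down the thread.

```r
library(data.table)

# Hypothetical helper (not part of data.table): returns a copy of x in which
# every factor column has its unused levels dropped, with the key restored.
drop_levels_dt <- function(x) {
  keys <- key(x)            # NULL when x has no key
  out <- copy(x)            # work on a copy so the caller's table is untouched
  for (col in names(out)) {
    if (is.factor(out[[col]])) {
      # factor() applied to a factor rebuilds it with only the levels in use
      set(out, j = col, value = factor(out[[col]]))
    }
  }
  if (!is.null(keys)) setkeyv(out, keys)   # re-key in case set() cleared it
  out
}

# Usage on a table shaped like the one in the bug report
dt <- data.table(name = factor(c("a", "b", "c")), value = 1:3)
setkey(dt, name)
sub_dt <- drop_levels_dt(dt[name != "a"])
levels(sub_dt$name)
```

Replacing the columns one at a time sidesteps the data.frame-style indexing that droplevels.data.frame relies on, which is why I would expect it to behave, but I have only tried it on small examples.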
On Feb 21, 2012, at 1:59 PM, Matthew Dowle wrote:
>
> I see the problem too but (just) adding droplevels.data.table might miss
> the root cause.
>
>> because the way the
>> droplevels.data.frame method works isn't compatible with data.table
>> indexing.
>
> But it's intended to be. I can see the switch at the top of [.data.table
> is detecting the caller isn't data.table aware, and it is then dispatching
> to `[.data.frame` but why it then isn't working I'm not sure. Something to
> do with the missing j or missing drop not being passed through correctly,
> perhaps.
>
> I have heard it said (once or twice) that data.table is "almost"
> compatible with non-data.table-aware packages, but never had an example
> before. I wonder if this is it!
>
> A (fast) droplevels.data.table using := would be good anyway, though.
>
> Matthew
>
>
>
>> Hi,
>>
>> I see what the problem is -- we need to provide a
>> droplevels.data.table S3 method, because the way the
>> droplevels.data.frame method works isn't compatible with data.table
>> indexing.
>>
>> Will fix:
>>
>> https://r-forge.r-project.org/tracker/index.php?func=detail&aid=1841&group_id=240&atid=975
>>
>> Thanks for raising the flag.
>>
>> Cheers,
>> -steve
>>
>> On Tue, Feb 21, 2012 at 12:38 PM, pchalasani <pchalasani at gmail.com> wrote:
>>> Surprising that this wasn't noticed before, or perhaps I'm not following
>>> some recommended idiom to drop levels when using data.table. The following
>>> code illustrates the bug clearly. The bug remains regardless of whether I
>>> use "subset" or simply use dt1 = dt[ name != 'a' ].
>>>
>>>
>>>
>>> d <- data.table(name = c('a','b','c'), value = 1:3)
>>> dt <- data.table(d)
>>> setkey(dt,'name')
>>> dt1 <- subset(dt,name != 'a') # or dt1 <- dt[ name != 'a' ]
>>> > dt1
>>> name value
>>> [1,] b 2
>>> [2,] c 3
>>>
>>> > droplevels(dt1)
>>> name value
>>> [1,] b 1
>>> [2,] c 3
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://r.789695.n4.nabble.com/BUG-droplevels-mangles-subsetted-data-table-tp4407694p4407694.html
>>> Sent from the datatable-help mailing list archive at Nabble.com.
>>> _______________________________________________
>>> datatable-help mailing list
>>> datatable-help at lists.r-forge.r-project.org
>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
>>
>>
>>
>> --
>> Steve Lianoglou
>> Graduate Student: Computational Systems Biology
>> | Memorial Sloan-Kettering Cancer Center
>> | Weill Medical College of Cornell University
>> Contact Info: http://cbio.mskcc.org/~lianos/contact
>>
>
>
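
For what it's worth, the indexing difference Steve mentions is easy to see in isolation. A single logical vector inside [ ] picks columns on a data.frame but rows on a data.table. The sketch below uses a made-up two-row table and only shows that difference. It does not establish that this is the root cause Matthew is after.

```r
library(data.table)

# Made-up example data, same shape as the subsetted table in the report
df <- data.frame(name = factor(c("b", "c")), value = 2:3)
dt <- data.table(df)

# A logical vector of the kind a data.frame method might build,
# e.g. "which columns are factors"
ix <- c(TRUE, FALSE)

by_df <- df[ix]   # data.frame: list-style indexing, keeps the 'name' column
by_dt <- dt[ix]   # data.table: ix is taken as i, so it keeps the first row

dim(by_df)
dim(by_dt)
```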