[datatable-help] Subsetting that behaves right for both data frames and data.tables?

Steve Lianoglou mailinglist.honeypot at gmail.com
Wed Jul 20 14:25:48 CEST 2011


Hi,

On Wed, Jul 20, 2011 at 8:06 AM, Chris Neff <caneff at gmail.com> wrote:
> I'm used to seeing the column names at the bottom of the column too, but
> that is only if the data.table is long enough. My example was too short for
> that, so I made the same sort of mistake you did :(
> Okay, that is a way, but is it a good way? Not sure...

Using `subset` on a data.table is, for the most part, fine.
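
As a minimal sketch of that, assuming the data.table package is installed: `subset` with `select=` takes a character vector of column names for both classes, so one function can serve either. The helper name `select_cols` is made up for illustration, and the trick relies on no column in the data sharing the name of the variable passed to `select=`.

```r
library(data.table)

# Hypothetical helper: pick columns by name from a data.frame or a data.table.
select_cols <- function(x, var.names) {
  # subset() dispatches on the class of x, so `with=FALSE` is not needed here.
  subset(x, select = var.names)
}

DF <- data.frame(x = 1:10, y = 2:11, z = 3:12)
DT <- data.table(DF)
var.names <- c("x", "y")

names(select_cols(DF, var.names))  # column names of the data.frame result
names(select_cols(DT, var.names))  # column names of the data.table result
class(select_cols(DT, var.names))  # the class is preserved for data.table input
```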

The only problem is that the data.table that comes out of subset (and
transform, for that matter) loses any keys that were set on the
data.table that went in.

FYI, I've been meaning to tweak it so that it does, but haven't yet.
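
Until then, a cautious workaround is to check the key on the result and set it again if it was dropped. This is only a rough sketch: it assumes a data.table version that provides `key()` and `setkeyv()`, and the table and column names are invented for illustration.

```r
library(data.table)

DT <- data.table(id = c(3L, 1L, 2L), value = c(30, 10, 20))
setkey(DT, id)               # DT is now sorted by, and keyed on, id

sub <- subset(DT, value > 10)

# Depending on the data.table version, the key may or may not survive subset().
if (!identical(key(sub), key(DT))) {
  setkeyv(sub, key(DT))      # restore the key by reference
}
key(sub)
```

This only works when every key column is still present in the result, so it would need adjusting if `select=` drops one of them.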

-steve

>
> 2011/7/20 Timothée Carayol <timothee.carayol at gmail.com>
>>
>> Sorry my mistake -- subset does return a data.table.
>> (I was using as an example a data.table with 100 rows, and stupidly using
>> the fact that it printed the whole thing rather than only the first 10 rows
>> as my criterion for whether it worked or not, forgetting that
>> print.data.table prints up to 100 rows. I feel a bit stupid.)
>> Why doesn't it work for you if that is the case?
>>
>> DF <- data.frame(a=1:200, b=1:10)
>> DT <- as.data.table(DF)
>> subDT <- subset(DT, select=a)
>> class(subDT)
>> subDF <- subset(DF, select=a)
>> class(subDF)
>> identical(as.data.frame(DT), DF)
>>
>>
>> On Wed, Jul 20, 2011 at 12:50 PM, Chris Neff <caneff at gmail.com> wrote:
>>>
>>> Yeah I realized that myself.
>>> Another one: the function "with" doesn't seem to do what I want... but at
>>> least it is consistent!
>>>
>>> 2011/7/20 Timothée Carayol <timothee.carayol at gmail.com>
>>>>
>>>> Sorry --
>>>> subset() was a poor idea, as it will return a data.frame even if the
>>>> argument is a data.table..
>>>>
>>>>
>>>>
>>>> 2011/7/20 Timothée Carayol <timothee.carayol at gmail.com>
>>>>>
>>>>> Hi--
>>>>> You can use the subset() command with the select= option; not sure it's
>>>>> the best solution, though.
>>>>>
>>>>> Timothee
>>>>>
>>>>>
>>>>> On Wed, Jul 20, 2011 at 12:26 PM, Chris Neff <caneff at gmail.com> wrote:
>>>>>>
>>>>>> I have a function where I pass a data frame and some variable names to
>>>>>> calculate statistics on. However, I am at a loss as to how to write it
>>>>>> correctly so that both data.frame and data.table work with it. If I have:
>>>>>> DF = data.frame(x=1:10,y=2:11,z=3:12)
>>>>>> DT = data.table(DF)
>>>>>> var.names = c("x","y")
>>>>>>
>>>>>> I can do the following things to subset:
>>>>>> DT[,var.names,with=FALSE]
>>>>>> DF[,var.names]
>>>>>>
>>>>>> but of course DT[,var.names] won't give me back what I want, and
>>>>>> DF[,var.names,with=FALSE] returns an error because with doesn't exist there.
>>>>>> So how do I do this?
>>>>>> Thanks,
>>>>>> -Chris
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> datatable-help mailing list
>>>>>> datatable-help at lists.r-forge.r-project.org
>>>>>>
>>>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
>>>>>>
>>>>>
>>>>
>>>
>>
>



-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

