[datatable-help] Extract Single Column as Vector

Ricardo Saporta saporta at scarletmail.rutgers.edu
Sat May 18 03:34:52 CEST 2013


Hm... Eddi does seem to have a point here. While I agree with Frank that
once you're used to it, it is rather straightforward to deal with, I can
see why one would expect a vector, i.e., that the last of
the following `identical` calls should evaluate to `TRUE`:

    library(data.table)
    dt <- data.table(a=c(1, 2, 3), b=c(4, 5, 6))   # Alexandre's example table
    df <- as.data.frame(dt)

    > identical(df[, "a"], dt[, get("a")])
    [1] TRUE
    > identical(df[, "a"], dt[["a"]])
    [1] TRUE
    > identical(df[, "a"], dt[, "a", with=FALSE])
    [1] FALSE

    rm(df)
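
For what it's worth, here is a minimal sketch of the ways to get a plain
vector when the column name is only available as a character scalar. It
reuses Alexandre's toy table; `colname`, `v1`, `v2` and `v3` are just
illustrative names, and the behaviour shown is what this thread reports
rather than anything the documentation promises:

```r
library(data.table)

# Same toy table as in Alexandre's original message
dt <- data.table(a = c(1, 2, 3), b = c(4, 5, 6))
colname <- "a"   # column name known only as a character scalar

# 1. List-style extraction: returns the column itself
v1 <- dt[[colname]]

# 2. j is evaluated inside dt, and get() looks the name up there
v2 <- dt[, get(colname)]

# 3. with = FALSE yields a one-column data.table, so unwrap it
v3 <- dt[, colname, with = FALSE][[1]]

# All three should be the same plain numeric vector
stopifnot(is.numeric(v1), identical(v1, v2), identical(v1, v3))
```

`dt[[colname]]` is probably the simplest of the three. The `get()` form can
pick up the wrong object if the table happens to contain a column that is
itself called `colname`.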


-Rick


Ricardo Saporta
Graduate Student, Data Analytics
Rutgers University, New Jersey
e: saporta at rutgers.edu



On Fri, May 17, 2013 at 4:26 PM, Eduard Antonyan
<eduard.antonyan at gmail.com> wrote:

> Well, looking at the documentation:
>
> j: A single column name, single expression of column names, list() of
> expressions of column names, an expression or function call that evaluates
> to list (including data.frame and data.table which are lists, too), or
> *(when with=FALSE) same as j in [.data.frame.*
> ...
> with: By default with=TRUE and j is evaluated within the frame of x.
> The column names can be used as variables. *When with=FALSE, j works as
> it does in [.data.frame.*
>
> The bolded part of the documentation (between asterisks above) doesn't
> match the actual behavior.
>
>
>
> On Fri, May 17, 2013 at 2:44 PM, Frank Erickson <FErickson at psu.edu> wrote:
>
>> @Arun and eddi: This question has come up before.
>>
>> http://r.789695.n4.nabble.com/Better-hacks-getting-a-vector-AND-using-with-inserting-chunks-of-rows-tt4666592.html
>> (And I'm sure there are other times, too.) I can't say I've heard anyone
>> arguing about it, though. :)
>>
>> I guess it works that way because
>> ...in dt[ ,a], j is an expression which evaluates to a vector
>> ...in dt[,"a",with=FALSE] the option turns on the "you must want one or
>> more columns" mode, translating j from "a" to list(a)
>>
>> It's unintuitive if you're expecting data frame behavior (you know,
>> drop=TRUE, as Arun mentioned), but if you've already seen dt[,list(a)], it
>> shouldn't be much of a surprise. Adding the drop option, and maybe
>> defaulting it to TRUE when with=FALSE might satisfy eddi's concern...?
>>
>>
>> On Fri, May 17, 2013 at 10:22 AM, Eduard Antonyan <
>> eduard.antonyan at gmail.com> wrote:
>>
>>> I don't remember discussing this issue...? What is the conceptual
>>> difference between dt[, a] and dt[, "a", with = F] and what does 'drop'
>>> have to do with this??
>>>
>>>
>>> On Fri, May 17, 2013 at 10:02 AM, Arunkumar Srinivasan <
>>> aragorn168b at gmail.com> wrote:
>>>
>>>>  Eduard, are we discussing the same thing again :)? Wasn't this somehow
>>>> your question as well.. the discrepancy between:
>>>>
>>>> dt[, a] and dt[, "a", with=FALSE].
>>>>
>>>> There should be a drop=TRUE/FALSE option (as in the case of data.frame)
>>>> that should be used when you use `with=FALSE`. Until then, the default
>>>> option seems to be drop=FALSE, which results in a data.table.
>>>>
>>>> Alexandre, as of now, it could be done as Eduard points out.
>>>>
>>>> Arun
>>>>
>>>> On Friday, May 17, 2013 at 4:59 PM, Eduard Antonyan wrote:
>>>>
>>>> Use dt[[colname]], but this seems like a bug to me - I would've thought
>>>> that dt[, a] and dt[, "a", with = F] should return the exact same thing.
>>>>
>>>>
>>>> On Fri, May 17, 2013 at 9:42 AM, Alexandre Sieira <
>>>> alexandre.sieira at gmail.com> wrote:
>>>>
>>>> Sorry if this is a basic question.
>>>>
>>>> I'm using R 3.0.0 and data.table 1.8.8. The documentation for 'j'
>>>> states that "A single column or single expression returns that type,
>>>> usually a vector."
>>>>
>>>> I am able to obtain this behavior if I know the column name in advance:
>>>>
>>>> > dt = data.table(a=c(1, 2, 3), b=c(4, 5, 6))
>>>> > dt
>>>>    a b
>>>> 1: 1 4
>>>> 2: 2 5
>>>> 3: 3 6
>>>> > str(dt[,a])
>>>>  num [1:3] 1 2 3
>>>>
>>>> However, if I don't, no such luck:
>>>>
>>>> > colname="a"
>>>> > str(dt[,colname,with=F])
>>>> Classes ‘data.table’ and 'data.frame': 3 obs. of  1 variable:
>>>>  $ a: num  1 2 3
>>>>  - attr(*, ".internal.selfref")=<externalptr>
>>>>
>>>> Is there a way to extract an entire column as a vector if I have the
>>>> column name as a character scalar?
>>>>
>>>> Thank you!
>>>>
>>>> --
>>>> Alexandre Sieira
>>>> CISA, CISSP, ISO 27001 Lead Auditor
>>>>
>>>> "The truth is rarely pure and never simple."
>>>> Oscar Wilde, The Importance of Being Earnest, 1895, Act I
>>>>
>>>> _______________________________________________
>>>> datatable-help mailing list
>>>> datatable-help at lists.r-forge.r-project.org
>>>>
>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help