[datatable-help] Odd behavior with [.data.table and J

Matthew Dowle mdowle at mdowle.plus.com
Tue Sep 4 20:11:54 CEST 2012


> On Mon, Aug 6, 2012 at 1:31 PM, Matthew Dowle
> <mdowle at mdowle.plus.com> wrote:
>
>> > Hello,
>> >
>> > I'm running into an odd behavior with data.table. Given the following
>> > variables:
>> >
>> >> tbl = data.table(foo=c(1,2,3), bar=c(1.1, 2.2, 3.3))
>> >> setkey(tbl, foo)
>> >> i = data.frame(foo=1)
>>
>> And
>>    i = data.table(foo=1)
>> gives the same results below, so it doesn't seem related to whether i is
>> a data.table -vs- a data.frame.
>>
>> >
>> > ... I would expect the following three ways of indexing "tbl" using "i"
>> > to give the same result, but they don't:
>> >
>> >> tbl[i]
>> >    foo bar
>> > 1:   1 1.1
>> >> tbl[J(i)]
>> > Error in `[.data.table`(tbl, J(i)) :
>> >   typeof x.foo (double) != typeof i.V1 (list)
>> >> tbl[data.table(i)]
>> >    foo bar
>> > 1:   1 1.1
>> >
>> > Anything I'm missing on why tbl[J(i)] wouldn't work like the other two?
>> > Or have I hit a bug? I'm running R 2.15.1 64bit on Windows 7, with
>> > data.table 1.8.2.
>>
>> J inside [] is an alias for list(), not data.table(). I don't think this
>> changed in 1.8.2 but might be wrong. data.table() is much heavier than
>> list(): it checks argument types up front and recycles vectors to ensure
>> each item of the data.table has the same length, for instance.
>> data.table() also unpacks data.frame and data.table arguments like a
>> cbind would, whereas list() treats data.table and data.frame arguments
>> as though they are list columns.
>>
>> So it's doing something sensible and is correct behaviour at first
>> glance. Please confirm this makes sense, and if so I'll add it as an FR
>> to improve the error message and documentation.
>>
>
> Okay, this makes sense. But pretty much the whole of the documentation on
> that topic is wrong, then. Hence my confusion.
>
> As a single example (there are others), in the "Introduction to the
> data.table package in R" document, page 6:
>
>   [...] Since we do this a lot, there is an alias for data.table called J(),
>   short for join.
>

I see what you mean, thanks for following up. J was an alias for
data.table at some point, maybe up to quite recently. It might have
changed in 1.8.0.

I'm leaning towards making J() work again as you expected it to, then.
That would be consistent with it being an alias for data.table, so the
documentation wouldn't need to change. With fresh eyes, it seems like a
bug now.
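
To make the list() versus data.table() difference concrete, here is a
minimal sketch. It has not been run against 1.8.2; it assumes a
reasonably recent data.table is installed, and the names as_list, as_dt
and res are only for illustration. It builds both objects outside of [],
so it doesn't depend on how J is treated inside []:

```r
library(data.table)

tbl <- data.table(foo = c(1, 2, 3), bar = c(1.1, 2.2, 3.3))
setkey(tbl, foo)
i <- data.frame(foo = 1)

# list() keeps the whole data.frame as a single element, so the first
# "column" is itself a list rather than a double vector.
as_list <- list(i)
length(as_list)        # 1
typeof(as_list[[1]])   # "list"

# data.table() unpacks the data.frame into its columns, like cbind would.
as_dt <- data.table(i)
names(as_dt)           # "foo"
typeof(as_dt$foo)      # "double", the same type as the key column tbl$foo

# Joining on the unpacked form works, as in the original report.
res <- tbl[as_dt]
```

The type mismatch in the reported error (double versus list) lines up
with the first half of the sketch: the join column supplied through
list() is the data.frame itself, not its foo column.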

If nobody on the list objects, please raise a bug report.

Thanks.

>
> Thanks for your help,
>
>   Christian
>



