[datatable-help] Odd behavior with [.data.table and J

Matthew Dowle mdowle at mdowle.plus.com
Mon Sep 10 01:42:00 CEST 2012


>>
>> I see what you mean, thanks for following up. J was an alias for
>> data.table at some point, maybe up to quite recently. Might have changed
>> in 1.8.0.
>>
>> I'm leaning towards making J() work again as you expected it to, then.
>> Consistent with it being an alias for data.table, so documentation
>> doesn't need to change. With fresh eyes, it seems like a bug now.
>>
>> If nobody on the list objects, please raise a bug report.
>>
>
> Here is the bug report:
> https://r-forge.r-project.org/tracker/index.php?func=detail&aid=2265&group_id=240&atid=975

Great, thanks.

>
> While we're talking about behavior that changed in recent versions, is
> there any reason why recent versions of data.table give a warning when
> used with factor columns (for joins)? I know these things are not a
> problem for people who use exclusively data.table everywhere, but we have
> a mix of data.table and data.frame objects in our code (for various
> reasons)... so we can't really convert to strictly string columns.
> Actually, I don't really see the reason for this warning, as everything
> works fine even with factor columns.

What's the exact warning? As far as warnings go, joining factor to factor
is handled differently from joining non-factor to factor, or factor to
non-factor. Is it a warning when joining character to factor?

> For data.table and data.frame interoperability, it would also be useful if
> setnames worked on data.frames.

Yes, but that risks breaking things. A data.table has an
.internal.selfref attribute that enables it to be updated by reference in
a way that allows it to be tracked. If setnames worked for data.frame,
then I'm not sure we could maintain compatibility. For example, if two
symbols pointed to the same data.frame, then setnames would update both,
and that's contrary to traditional R behaviour. When you use data.table
you know to expect that, as you're moving away from traditional R
behaviour with regard to copies. I suppose we could allow it, and leave it
up to the user to be careful. But is there a reason you can't use
data.table instead of data.frame? A data.table is() a data.frame, as
data.table inherits from data.frame.
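
To make the difference concrete, here is a minimal sketch of what I mean.
The object names (DT, DT2, DF, DF2) and the toy columns are made up for
illustration only; the point is just how two symbols behave after a rename
in each case.

```r
library(data.table)

# data.table: two symbols bound to the same object (no copy is made by <-)
DT  <- data.table(a = 1:3, b = letters[1:3])
DT2 <- DT
setnames(DT, "a", "x")      # renames by reference, in place
names(DT2)                  # "x" "b" : DT2 sees the change as well

# traditional R: renaming through names<- only affects the symbol assigned to
DF  <- data.frame(a = 1:3)
DF2 <- DF
names(DF)[1] <- "x"
names(DF2)                  # "a" : DF2 is unchanged

# a data.table is also a data.frame
is.data.frame(DT)           # TRUE
inherits(DT, "data.frame")  # TRUE
```

If you really do want an independent data.table, copy(DT) gives you one
before you start renaming.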

>
> Thanks,
>
>   Christian
>



