[datatable-help] new key argument to [.data.table in 1.8.11

Eduard Antonyan eduard.antonyan at gmail.com
Sun Sep 29 16:02:06 CEST 2013


Ah, what am I thinking: with the old approach you have to copy and still set
a key, so unless you need to go back to the old key afterwards (rarely?), the
new argument is strictly faster.
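
A rough way to look at the two costs being compared, as a minimal sketch only:
it assumes the data.table package is installed and uses a made-up table `x`
with hypothetical columns A and B. It times `copy()` and `setkey()`
separately and does not use the new `key` argument itself; the numbers will
depend entirely on the data.

```r
library(data.table)

set.seed(1)
n <- 1e5
# Hypothetical table, keyed on A to start with
x <- data.table(A = sample(n), B = sample(n))
setkey(x, A)

# Old approach: copy x, then key the copy on B (one copy, one key setting)
t_copy <- system.time(y <- copy(x))
t_key  <- system.time(setkey(y, B))

# x itself is untouched and still keyed on A; only the copy is keyed on B
key(x)
key(y)

# Elapsed seconds for the copy and for the sort
c(copy = t_copy[["elapsed"]], setkey = t_key[["elapsed"]])
```

Re-keying `x` in place skips the copy step, but restoring the original key
afterwards would cost a second `setkey()` call.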


On Sun, Sep 29, 2013 at 8:47 AM, Eduard Antonyan
<eduard.antonyan at gmail.com> wrote:

> There wasn't a 'key' argument before and yes, it will change the key
> regardless of whether you're merging or not. Initially I added it just for
> the merges, but then realized that there is no conceptual reason to
> restrict it just to merges.
>
> FYI, the reason you probably thought there was a 'key' argument before is
> that R allows partial matching of argument names, so you were actually
> using 'keyby' (which has not changed).
>
> You raise a good point that I hadn't thought of: copying can be faster
> than sorting. I will check when that's true. It's easy to implement the
> copy version; I did it this way because I assumed it was the faster option,
> but if it's not, I might as well copy and do this for merges only.
>
> On Sep 28, 2013 11:50 PM, "Frank Erickson" <FErickson at psu.edu> wrote:
>
>> Hi,
>>
>> I'm just continuing a discussion with @eddi that would not fit in an SO
>> comment. If you want to catch up, the references are...
>>
>> http://r-forge.r-project.org/tracker/index.php?func=detail&aid=4675&group_id=240&atid=978
>> http://stackoverflow.com/a/19074195/1191259
>> The SO question (scroll up on the second link) was whether there was a
>> way to use a "temporary" key for X in an X[Y] join.
>>
>> @eddi:
>>
>> +1. Yeah, I like this new option and will probably use it.
>>
>> Will this also overwrite the key when using [.data.table without doing
>> joins? That might be backward incompatible I guess, since `key` is already
>> an argument to `[.data.table`. That is, will x[i,,key='B'] do anything? I
>> don't think that type of command has had much use until now, and adding a j
>> argument (that doesn't start with `:=`) always makes a copy (right?), so
>> maybe backward compatibility would not be an issue there.
>>
>> Regarding whether it's a reasonable compromise, ... well, I'll be using
>> it, anyway! I don't know what the feasibility constraints are on
>> implementing what I initially had in mind, so I'll defer to you and the
>> developers on that. If "secondary keys" are implemented down the road, that
>> would solve this problem in most cases.
>>
>> As far as when I will use it, I guess it depends on the relative cost of
>> making a copy vs resetting the key on x. If I use the old syntax, I make a
>> copy, but don't have to change x's key back at the end (one copy, one key
>> setting). With the new syntax, I'd have to change the key on x back
>> afterward (zero copies, two key settings). If I know the sorting takes a
>> long time (e.g., because the key is the whole set of columns), I might
>> still go with copying.
>>
>> Best,
>>
>> Frank
>>
>
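
The partial argument matching mentioned in the quoted reply can be reproduced
in plain R. The sketch below uses hypothetical functions `f` and `g` (they
are stand-ins, not data.table's `[.data.table`) to show that `key =` is
silently matched to `keyby` when no argument named exactly `key` exists, and
that an exact match takes over once one does.

```r
# Hypothetical function that has 'keyby' but no 'key' argument
f <- function(x, by = NULL, keyby = NULL) {
  if (is.null(keyby)) "keyby not supplied" else paste("keyby =", keyby)
}

# 'key' is a unique prefix of 'keyby', so R matches it partially
res <- f(1, key = "B")
res

# Hypothetical function with a formal named exactly 'key':
# exact matching is tried first, so 'keyby' is left unset
g <- function(x, keyby = NULL, key = NULL) {
  c(keyby = is.null(keyby), key = is.null(key))
}
g(1, key = "B")
```

Under these assumptions, a call written as `key = "B"` would mean `keyby`
before a real `key` argument is added and `key` afterwards.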