[datatable-help] "Within" Reversed column order?

Matthew Dowle mdowle at mdowle.plus.com
Tue May 29 17:47:24 CEST 2012


> Thinking through this, the reasons I use within:
> 1) Avoid chained [:=]
> 2) Better semantics than transform (i.e., avoid commas, have one large
> expression block)
>
> I'm all for keeping consistency with data.frame, and I assume that's the
> real reason these functions are here.

Yes, and for historical reasons, before := and friends came along. Btw,
within's reverse order seems accidentally consistent with data.frame, not
deliberately ;)

> Perhaps this is really a case where we want an efficient multi-update
> instead of a properly ordered within?

Yes. Now that := by group is implemented in 1.8.1, the next step would be
to make it multiple := by group, which can't be done efficiently by
chaining anyway. So,

    DT[,{newcol1:=colA+colB
         newcol2:=newcol1*2
        },by=month(colC)]

Where the 2nd := can use the result from the first like that, too. Would
that syntax be ok?
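
For comparison, here is a minimal sketch of two ways to get the same columns without the block syntax proposed above. The table DT and its values are invented for illustration. The second form assumes a data.table release that accepts a character vector of target columns on the left of := together with by. I believe current releases do, but treat that as an assumption rather than a statement about 1.8.1.

```r
library(data.table)

# Hypothetical data, made up for this example
DT <- data.table(
  colA = 1:4,
  colB = c(10, 20, 30, 40),
  colC = as.Date(c("2012-01-15", "2012-01-20", "2012-02-10", "2012-02-25"))
)

# Option 1: chained := by group. The second call can see newcol1,
# but the grouping is computed twice.
DT1 <- copy(DT)
DT1[, newcol1 := colA + colB, by = month(colC)][
    , newcol2 := newcol1 * 2, by = month(colC)]

# Option 2: one := with several target columns. The intermediate
# result is held in a local variable and reused within each group.
DT2 <- copy(DT)
DT2[, c("newcol1", "newcol2") := {
  tmp <- colA + colB
  list(tmp, tmp * 2)
}, by = month(colC)]
```

Option 2 groups only once, which is the efficiency the multiple := by group idea is after. The proposed block syntax would let the second column refer to newcol1 by name instead of going through a temporary.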

> Or perhaps an overload on transform that allows {expr}.

Maybe. Kinda getting hooked on := now, though.

Matthew

>
> On 5/28/2012 6:09 AM, Matthew Dowle wrote:
>> Not sure why but it seems to be consistent with base data.frame :
>>
>>> DF = data.frame(a=1:3,b=4:6)
>>> within(DF,{c=a*b;d=a+b})
>>    a b d  c
>> 1 1 4 5  4
>> 2 2 5 7 10
>> 3 3 6 9 18
>> I don't mind it being raised as a bug in data.table and we'll fix it and
>> add to FAQ 2.17 as another difference to data.frame. Or maybe it
>> would be clearer to remove within.data.table(). Do we need it?
>>
>> Matthew
>



