[datatable-help] Is there a good way to do a self join in one line?

Chris Neff caneff at gmail.com
Thu Aug 11 14:53:32 CEST 2011


As a workaround until := is supported with grouping, I've found

DT[,z:=ave(x, y, FUN=sum)]

to be a reasonable alternative to

DT[, z:=sum(x), by=y]

until the second form is supported, of course.
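
For what it's worth, here is a minimal sketch of the workaround on the example data from the original question. It assumes a data.table version in which := can be used without i. The tapply() cross-check is only there for illustration and is not part of the workaround itself.

```r
library(data.table)

DT <- data.table(x = 1:10, y = rep(1:2, each = 5))

# ave() returns a vector as long as x, where each element is replaced by
# FUN applied to the x values in the same group (groups defined by y).
DT[, z := ave(x, y, FUN = sum)]

# Illustrative cross-check: compute the per-group sums separately and
# expand them back to one value per row.
sums <- tapply(DT$x, DT$y, sum)
stopifnot(all(DT$z == sums[as.character(DT$y)]))
```

Since ave() is plain base R, it doesn't use data.table's grouping machinery, so I wouldn't expect it to be as fast as the by= form on large tables.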

On 3 August 2011 09:17, Matthew Dowle <mdowle at mdowle.plus.com> wrote:
>
> There is an example in ?data.table (admittedly one line) :
>
>    DT[,transform(.SD,z=sum(x)),by=y]
>
> But, see point 2 on wiki :
>
>    http://rwiki.sciviews.org/doku.php?id=packages:cran:data.table
>
> Maybe you could rep(sum(x),.N) by group, then cbind afterwards.
>
> All that may be quite "hard" so the future idiom for this will simply be :
>
>    DT[,z:=sum(x),by=y]
>
> := is implemented in 1.6.3 but only in combination with i so far. See NEWS
> for 1.6.3. := isn't implemented when grouping (yet).  So, (I think) you're
> stuck with cbind for the moment if speed is important as per wiki example,
> otherwise transform .SD in j.
>
> Matthew
>
> "Chris Neff" <caneff at gmail.com> wrote in message
> news:CAAuY0RV=Ot2XkVopihk6WmsyMrfcs7hrKtuXMLktiW1nrcRMdw at mail.gmail.com...
>> Say I want to calculate an aggregate statistic and append it to the
>> data frame all in one move. Like this:
>>
>> DT <- data.table(x= 1:10, y=rep(1:2,each=5))
>>
>> DT <- DT[, list(x, z=sum(x)), by=y]
>>
>> This will append the new variable z to the data frame. But what if I
>> have a lot of columns, and I don't want to address them by name like I
>> did there? I'd like to do something like:
>>
>>
>> DT <- DT[, list(names(DT), z=sum(x)), by=y]
>>
>> but that won't work because names(DT) is a character vector, not the
>> parts of the list expression I want. I mean, there is the following:
>>
>> tmp <- DT[,list(z=sum(x)), by=y]
>>
>> DT <- DT[tmp]
>>
>> but creating a temporary variable is annoying.  This doesn't work:
>>
>> DT <- DT[DT[, list(z=sum(x)), by=y]]
>>
>> Thoughts?
>>
>> Chris