[datatable-help] Slow execution: Extracting last value in each group

Arunkumar Srinivasan aragorn168b at gmail.com
Fri Aug 16 12:37:21 CEST 2013


Frank, 
Is it a Windows machine as well?
And could you try to use `debugonce` to find out the line(s) where it's slow?
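
A minimal sketch of what I mean, assuming the table is called `dt1` and has a `Date` column. The toy data below are made up, not your data, and the `interactive()` guard is only there so the snippet does not stall when run from a script:

```r
library(data.table)

# Made-up stand-in for dt1 (100 dates, 5 rows each)
dt1 <- data.table(Date = rep(as.Date("2013-01-01") + 0:99, each = 5),
                  x = 1:500)

if (interactive()) {
  # Flag data.table's subset method so that its next call runs in the browser
  debugonce(data.table:::`[.data.table`)

  # This call opens the browser: press 'n' to step line by line and
  # note which step stalls; 'Q' quits the browser
  d <- dt1[, .SD[.N], by = "Date"]
}
```

If stepping through by hand is too tedious, noting roughly where it hangs is already useful.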

Arun


On Friday, August 16, 2013 at 12:34 PM, Frank Erickson wrote:

> I get similar timings to Arun, with the data.table call being a lot slower than the others. If data.table is not optimized for that .SD expression, perhaps that is okay because, as Arun pointed out, there are alternatives. I can't guess why it would perform differently on different hardware, though.
> 
> # alternatives:
> a <- dt1[dt1[, .I[.N], by='Date']$V1]
> b <- dt1[J(unique(Date)),,mult='last'] # a little slower
> d <- dt1[, .SD[.N], by='Date'] # 600x slower; it would take ages to benchmark
> identical(a,b) # true
> identical(a,d) # false
> identical(as.data.frame(d),as.data.frame(a)) # true
> 
> --Frank
> 
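
For anyone who wants to reproduce the comparison, here is a rough sketch. The real `dt1` is not shown in this thread, so the table below is a made-up stand-in (its size, the `x` column and the seed are all assumptions), and the timings will depend on the machine and the data.table version:

```r
library(data.table)

# Hypothetical stand-in for dt1: 1000 dates, 5 rows per date
set.seed(1)
dt1 <- data.table(Date = rep(as.Date("2013-01-01") + 0:999, each = 5),
                  x = rnorm(5000))
setkey(dt1, Date)  # the J() join below needs a key on Date

# Row numbers of the last row in each group, then one subset
system.time(a <- dt1[dt1[, .I[.N], by = "Date"]$V1])

# Keyed join on the unique dates, keeping the last match
system.time(b <- dt1[J(unique(Date)), , mult = "last"])

# Last row of .SD within each group (the slow variant discussed above)
system.time(d <- dt1[, .SD[.N], by = "Date"])

# Compare the contents only, ignoring data.table attributes such as the key
all.equal(as.data.frame(a), as.data.frame(d))
```

Comparing via `as.data.frame()` follows Frank's last check, since `identical()` on the data.tables themselves can fail on attributes even when the values agree.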
