[datatable-help] Slow execution: Extracting last value in each group

Arunkumar Srinivasan aragorn168b at gmail.com
Fri Aug 16 09:07:03 CEST 2013


Steve,
Thank you.

arun, 
Could you rerun the timing with `microbenchmark` instead of `system.time` (with `times = 100` or so) and paste the results here?
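
Something along these lines should do. This is only a sketch: the table `x` below is a made-up stand-in (200 dates, 100 rows each) since I don't have your data, and the labels `sd_last` and `i_last` are just names I picked for the two expressions from this thread. It assumes the data.table and microbenchmark packages are installed.

```r
library(data.table)
library(microbenchmark)

# Hypothetical stand-in for the real table: 200 dates with 100 rows each.
set.seed(1)
x <- data.table(
  Date  = rep(seq(as.Date("2013-01-01"), by = "day", length.out = 200), each = 100),
  value = rnorm(200 * 100)
)

# Time both ways of extracting the last row per Date, 100 runs each.
bm <- microbenchmark(
  sd_last = x[, .SD[.N], by = "Date"],         # subset .SD to its last row in every group
  i_last  = x[x[, .I[.N], by = "Date"]$V1],    # collect the last row number per group, subset once
  times   = 100
)
print(bm)
```

The printed summary (min, median, max and so on) is more informative than a single `system.time` run, because one-off timings are easily distorted by garbage collection or other load on the machine.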

Also, you could call debugonce(data.table:::`[.data.table`) and then run

    x[, .SD[.N], by='Date']

to step through the call line by line and find the one that causes the lag.
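
A rough sketch of that session, in case it helps. The small `x` here is again a hypothetical toy table, not your data, and the `interactive()` guard is only there so the snippet does nothing harmful when sourced from a script.

```r
library(data.table)

# Hypothetical toy table; substitute the real `x` when reproducing the slowdown.
x <- data.table(Date = as.Date("2013-08-01") + rep(0:2, each = 3), value = 1:9)

if (interactive()) {
  # Flag the `[` method for data.table so that the next call to it opens the browser.
  # The backticks are needed because `[.data.table` is not a syntactic name.
  debugonce(data.table:::`[.data.table`)

  # This call now starts in the browser. At the Browse[n]> prompt:
  #   n  runs the next line
  #   c  continues to the end of the function
  #   Q  quits the browser
  x[, .SD[.N], by = "Date"]
}
```

While stepping with `n`, watch for the line after which the prompt takes noticeably long to come back; that is the place to look at.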


Arun


On Friday, August 16, 2013 at 9:01 AM, Steve Lianoglou wrote:

> Hi Arun,
> 
> On Thu, Aug 15, 2013 at 11:27 PM, Arunkumar Srinivasan
> <aragorn168b at gmail.com> wrote:
> > Sorry, but I'm not sure what your question is here. There seem to be
> > different timings between you and Steve. Do you want to verify
> > which one is correct? On my system, Steve's takes 0.003 seconds.
> > 
> 
> 
> Actually, the issue was that (as far as I could tell) his code and my
> code are exactly the same, but it runs orders of magnitude slower on
> his machine than anywhere else I could test.
> 
> It doesn't make any sense -- perhaps I'm not looking closely enough, but
> I suggested he send it here so more eyes could see it, because I'm
> stumped as to why/how that could happen.
> 
> > However, a *faster* version than Steve's solution (on bigger data) would be:
> > 
> > x[x[, .I[.N], by='Date']$V1]
> 
> Hah! Well done ;-)
> 
> -steve
> 
> -- 
> Steve Lianoglou
> Computational Biologist
> Bioinformatics and Computational Biology
> Genentech
> 
> 

