[datatable-help] datatable roll="next" takes 150 times longer than findInterval

Matt Dowle mdowle at mdowle.plus.com
Sun Feb 2 19:57:43 CET 2014


But this is at the *micro* second level ?!!

I confirm those results on my slow netbook but remember these are 
**micro** seconds i.e. 71,000 here is less than 0.1 of a second.

 > microbenchmark(flodel(X,Y), GG1(X,Y), GG2(X,Y))
Unit: microseconds
          expr       min        lq      median          uq max neval
  flodel(X, Y)   330.798   369.369    402.7935    455.3225 17996.26   100
     GG1(X, Y) 14287.380 14370.038  14466.5990  16010.5440 121082.77   100
     GG2(X, Y) 71164.270 85751.437 107951.3415 161676.5720 366003.62   100

To put it in some perspective :

 > system.time(GG2(X,Y))
    user  system elapsed
   0.072   0.000   0.072
 > system.time(GG2(X,Y))
    user  system elapsed
   0.080   0.000   0.079
 > system.time(GG2(X,Y))
    user  system elapsed
   0.072   0.000   0.072

Those times are in seconds.   So the task in question here takes about
0.07 seconds ?!

The 150x longer figure is actually (using figures from the S.O. answer)  
24695 microseconds (i.e. 0.024 seconds) divided by 168 microseconds 
(0.000168 seconds).  0.024 seconds / 0.000168 = "150 times".   If you 
rounded to milliseconds you could say data.table is infinitely slower  
(24ms / 0ms = Inf).
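
The arithmetic above can be checked at the R prompt. This is only a sketch: the two timings are the figures quoted from the S.O. answer, the variable names are made up for illustration, and truncating to whole milliseconds with floor() is my assumption about how the "24ms / 0ms" figure arises.

```r
# Timings quoted from the S.O. answer, in microseconds.
dt_us   <- 24695   # data.table solution
base_us <- 168     # findInterval solution

# The headline ratio: roughly 147, i.e. "nearly 150 times".
ratio <- dt_us / base_us
print(ratio)

# The same timings expressed in seconds: both are tiny.
print(c(data.table = dt_us, findInterval = base_us) / 1e6)

# Truncate both to whole milliseconds (assumption: floor) and divide.
# In R a positive number divided by zero is Inf.
ms_ratio <- floor(dt_us / 1e3) / floor(base_us / 1e3)
print(ms_ratio)
```

The ratio is real, but it compares two quantities that are both too small to matter on their own.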

I can believe there's scope for improvement, sure,  but not from this 
benchmark. The vectors need to be *much* bigger and the number of 
replications needs to be *much* smaller, say 3.   The task being timed 
needs to take a meaningful amount of time (say 5 seconds) *for a single run*.
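
A minimal sketch of that kind of benchmark, using only base findInterval() so that it runs without extra packages. The vector size n and the test data are placeholders, not the data from the S.O. question; n should be raised until a single run takes seconds on the machine at hand, and the data.table solution would then be timed the same way on the same data.

```r
# Placeholder size (an assumption): increase until one run takes seconds.
set.seed(1)
n <- 5e6
x <- sort(runif(n))   # findInterval() needs its second argument sorted
y <- runif(n)

# Three replications, each timing a single complete run, in seconds.
timings <- replicate(3, system.time(findInterval(y, x))[["elapsed"]])
print(timings)
```

With only three replications it is the elapsed seconds of each run that get compared, not a median over 100 microsecond-scale calls.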

Matt


On 02/02/14 12:27, Gabor Grothendieck wrote:
> The benchmark at the bottom of this post shows a problem where a 
> data.table roll="next" took nearly 150x longer than a base 
> findInterval() solution.  (The data.table solution is easier to write 
> though.) This suggests an area for possible speed improvement.
>
> http://stackoverflow.com/questions/21499742/fast-minimum-distance-interval-between-elements-of-2-logical-vectors-take-2/21500855#21500855
>
> -- 
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com