Dear Matthew,<br><br>I did some research on the internet about interpreting the statistics returned by time.<br><br>In this discussion [1], I read the following paragraph:<br><br><code>User+Sys</code> will tell you how much actual CPU time your
process used. Note that this is across all CPUs, so if the process has
multiple threads it could potentially exceed the wall clock time
reported by <code>Real</code>. Note that in the output these figures include the <code>User</code> and <code>Sys</code>
time of all child processes as well, although the underlying system
calls return the statistics for the process and its children separately.<br><br>So when I compute user + sys for the following two results (from my previous email):<br><br>> rbind(time.data.table[[1]],<div class="im">
time.data.table[[2]])<br>
user.self sys.self elapsed user.child sys.child<br></div>[1,] 0.008 0 0.005 0 0<br>[2,] 0.004 0 0.005 0 0<br><br>> rbind(time.data.table[[1]],<div class="im">
time.data.table[[2]])<br>
user.self sys.self elapsed user.child sys.child<br></div>[1,] 0.004 0.004 0.005 0 0<br>[2,] 0.009 0.000 0.005 0 0<br><br>Here user + sys is 0.008 and 0.004 in the first result but 0.008 and 0.009 in the second, so I see it "swinging" both ways.<br>
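<br>One way to check whether this swing is only timer noise is to time a batch of lookups rather than a single one, so that the total is well above the resolution of system.time. A minimal sketch, assuming the same table layout as in my previous email; the names dt, reps, batch and per.lookup and the batch size of 1000 are only illustrative:<br><br>

```r
library(data.table)

# Same layout as the smaller table in the earlier example:
# 600,000 rows split into 120 series of 5,000 rows each.
n  <- 6e5
n1 <- n / 5000
dt <- data.table(index = 1:n,
                 seriesname = rep(as.character(as.hexmode(1:n1)), each = 5000),
                 value = rnorm(n))
setkey(dt, seriesname)
searchitem <- as.character(as.hexmode(n1))

# Time a whole batch of keyed lookups instead of one lookup.
reps <- 1000  # assumed batch size; raise it if elapsed is still close to zero
batch <- system.time(for (k in seq_len(reps)) dt[J(searchitem)])

# Average elapsed time of a single lookup, in seconds.
per.lookup <- batch[["elapsed"]] / reps
print(per.lookup)
```

<br>If per.lookup still varies from one batch to the next, taking the minimum over a few batches should give a steadier figure.<br>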
<br><br>[1] <a href="http://stackoverflow.com/questions/556405/what-do-real-user-and-sys-mean-in-the-output-of-time1">http://stackoverflow.com/questions/556405/what-do-real-user-and-sys-mean-in-the-output-of-time1</a><br>
<br>Thank you,<br>Ashim<br><br><div class="gmail_quote">On Tue, Nov 29, 2011 at 1:42 PM, Matthew Dowle <span dir="ltr"><<a href="mailto:mdowle@mdowle.plus.com">mdowle@mdowle.plus.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
<br>
I don't follow. The elapsed time is 0.005 seconds in all cases. The<br>
times are extremely small anyway (5ms), it seems to be just noise.<br>
<br>
We're used to seeing examples like the one in the examples section of<br>
help(":=") where 591s is reduced to 1.1s. A 500 times speedup. But, more<br>
importantly, where the wall clock time (10 minutes) is meaningful, worth<br>
saving, and (hopefully) the readers understand the saving scales; i.e.,<br>
10 minutes saving can easily be hours with larger data.<br>
<br>
We can talk on the 5ms scale, too, but you'll need to be much more<br>
precise and read up on the subject first, please.<br>
<div class="HOEnZb"><div class="h5"><br>
<br>
On Tue, 2011-11-29 at 10:56 +0530, Ashim Kapoor wrote:<br>
> Dear Matthew,<br>
><br>
> Many thanks for your email.<br>
><br>
> Following your advice I split out the as.character(as.hexmode( )) and<br>
> ran it many times. The results swing both ways.<br>
><br>
><br>
><br>
> > library(xtable)<br>
> > library(data.table)<br>
> > start.size<-6e+5<br>
> ><br>
> > time.data.table<-list()<br>
> ><br>
> > for (i in 0:1){<br>
> + n<-start.size*10^i<br>
> + n1<-n/5000<br>
> + my.data.table<-data.table(index=1:n,seriesname=rep(as.character(as.hexmode(1:n1)),each=5000),value=rnorm(n))<br>
> + setkey(my.data.table,"seriesname")<br>
> + searchitem<-as.character(as.hexmode(n1))<br>
> + time.data.table[[i+1]]<-system.time(my.data.table[J(searchitem)])<br>
> + }<br>
> ><br>
> > rbind(time.data.table[[1]],time.data.table[[2]])<br>
> user.self sys.self elapsed user.child sys.child<br>
> [1,] 0.008 0 0.005 0 0<br>
> [2,] 0.008 0 0.005 0 0<br>
><br>
> > rbind(time.data.table[[1]],time.data.table[[2]])<br>
> user.self sys.self elapsed user.child sys.child<br>
> [1,] 0.008 0 0.005 0 0<br>
> [2,] 0.004 0 0.005 0 0<br>
><br>
> > rbind(time.data.table[[1]],time.data.table[[2]])<br>
> user.self sys.self elapsed user.child sys.child<br>
> [1,] 0.004 0 0.005 0 0<br>
> [2,] 0.004 0 0.005 0 0<br>
><br>
> > rbind(time.data.table[[1]],time.data.table[[2]])<br>
> user.self sys.self elapsed user.child sys.child<br>
> [1,] 0.008 0 0.005 0 0<br>
> [2,] 0.008 0 0.005 0 0<br>
><br>
> > rbind(time.data.table[[1]],time.data.table[[2]])<br>
> user.self sys.self elapsed user.child sys.child<br>
> [1,] 0.004 0.004 0.005 0 0<br>
> [2,] 0.009 0.000 0.005 0 0<br>
><br>
> Thank you,<br>
> Ashim<br>
><br>
><br>
> On Mon, Nov 28, 2011 at 4:53 PM, Matthew Dowle<br>
> <<a href="mailto:mdowle@mdowle.plus.com">mdowle@mdowle.plus.com</a>> wrote:<br>
><br>
> Hi,<br>
> Welcome to the list. Quick first response..<br>
><br>
> Comparing differences of 4ms of single runs is not usually very robust due<br>
> to overhead and cache effects. We usually prefer differences of many<br>
> seconds or minutes and even then take the minimum of 3 repeated runs,<br>
> using something like packages rbenchmark or microbenchmark.<br>
><br>
> as.character(as.hexmode()) will install those strings in R's global string<br>
> cache. The 2nd time will be faster as all those strings are already<br>
> cached. Whether that explains this case I don't know, seems plausible as<br>
> it's only 4ms. That part could be split out, repeated and timed<br>
> separately.<br>
><br>
> Think a simpler example would be possible, too. I missed the reason why<br>
> it's in a loop through 0:1 and for 4ms something like that might be making<br>
> a tiny difference.<br>
><br>
> HTH, Matthew<br>
><br>
> > Dear all,<br>
> ><br>
> > Please see my reproducible example below. My question is why does the<br>
> > 2nd table, which is bigger, have a smaller access time?<br>
> ><br>
> >> library(xtable)<br>
> >> library(data.table)<br>
> > data.table 1.7.2 For help type: help("data.table")<br>
> >> start.size<-6e+5<br>
> >><br>
> >> time.data.table<-list()<br>
> >><br>
> >> for (i in 0:1){<br>
> > + n<-start.size*10^i<br>
> > + n1<-n/5000<br>
> > + my.data.table<-data.table(index=1:n,seriesname=rep(as.character(as.hexmode(1:n1)),each=5000),value=rnorm(n))<br>
> > + setkey(my.data.table,"seriesname")<br>
> > + time.data.table[[i+1]]<-system.time(my.data.table[J(as.character(as.hexmode(n1/4))),])<br>
> > + }<br>
> ><br>
> >><br>
> >> rbind(time.data.table[[1]],time.data.table[[2]])<br>
> > user.self sys.self elapsed user.child sys.child<br>
> > [1,] 0.008 0 0.008 0 0<br>
> > [2,] 0.004 0 0.004 0 0<br>
> >> time.data.table[[1]]<br>
> > user system elapsed<br>
> > 0.008 0.000 0.008<br>
> >> time.data.table[[2]]<br>
> > user system elapsed<br>
> > 0.004 0.000 0.004<br>
> >><br>
> ><br>
> > Many thanks,<br>
> > Ashim<br>
><br>
> > _______________________________________________<br>
> > datatable-help mailing list<br>
> > <a href="mailto:datatable-help@lists.r-forge.r-project.org">datatable-help@lists.r-forge.r-project.org</a><br>
> ><br>
> <a href="https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help" target="_blank">https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help</a><br>
><br>
><br>
><br>
<br>
<br>
</div></div></blockquote></div><br>