Dear Matthew,<br><br>Many thanks for your email.<br><br>Following your advice, I split out the as.character(as.hexmode()) call and ran the example many times. The results swing both ways: sometimes the larger table is slower, sometimes it is faster.<br><br><br><br>> library(xtable)<br>> library(data.table)<br>
> start.size<-6e+5<br>> <br>> time.data.table<-list()<br>> <br>> for (i in 0:1){<br>+ n<-start.size*10^i<br>+ n1<-n/5000<br>+ my.data.table<-data.table(index=1:n,seriesname=rep(as.character(as.hexmode(1:n1)),each=5000),value=rnorm(n))<br>
+ setkey(my.data.table,"seriesname")<br>+ searchitem<-as.character(as.hexmode(n1))<br>+ time.data.table[[i+1]]<-system.time(my.data.table[J(searchitem)])<br>+ }<br>> <br>> rbind(time.data.table[[1]],time.data.table[[2]])<br>
user.self sys.self elapsed user.child sys.child<br>[1,] 0.008 0 0.005 0 0<br>[2,] 0.008 0 0.005 0 0<br><br>> rbind(time.data.table[[1]],time.data.table[[2]])<br>
user.self sys.self elapsed user.child sys.child<br>[1,] 0.008 0 0.005 0 0<br>[2,] 0.004 0 0.005 0 0<br><br>> rbind(time.data.table[[1]],time.data.table[[2]])<br>
user.self sys.self elapsed user.child sys.child<br>[1,] 0.004 0 0.005 0 0<br>[2,] 0.004 0 0.005 0 0<br><br>> rbind(time.data.table[[1]],time.data.table[[2]])<br>
user.self sys.self elapsed user.child sys.child<br>[1,] 0.008 0 0.005 0 0<br>[2,] 0.008 0 0.005 0 0<br><br>> rbind(time.data.table[[1]],time.data.table[[2]])<br>
user.self sys.self elapsed user.child sys.child<br>[1,] 0.004 0.004 0.005 0 0<br>[2,] 0.009 0.000 0.005 0 0<br><br>Thank you,<br>Ashim<br><br><br><div class="gmail_quote">
On Mon, Nov 28, 2011 at 4:53 PM, Matthew Dowle <span dir="ltr"><<a href="mailto:mdowle@mdowle.plus.com">mdowle@mdowle.plus.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
<br>
Hi,<br>
Welcome to the list. Quick first response:<br>
<br>
Comparing differences of 4 ms between single runs is not usually very robust, due<br>
to overhead and cache effects. We usually prefer differences of many<br>
seconds or minutes, and even then take the minimum of 3 repeated runs,<br>
using a package such as rbenchmark or microbenchmark.<br>
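<br>
A minimal base-R sketch of the "minimum of repeated runs" idea, for cases where rbenchmark or microbenchmark are not installed. The helper name min_elapsed and the which() lookup are assumptions made for illustration; the lookup is only a stand-in for the keyed data.table lookup:<br>

```r
# Hypothetical helper (not from this thread): run a function several times
# and keep the fastest elapsed time.
min_elapsed <- function(f, times = 3) {
  elapsed <- vapply(seq_len(times),
                    function(i) system.time(f())[["elapsed"]],
                    numeric(1))
  min(elapsed)
}

# Stand-in workload: same series names as in the example, but a plain
# vector scan instead of a keyed data.table lookup.
n <- 6e5
n1 <- n / 5000
seriesname <- rep(as.character(as.hexmode(1:n1)), each = 5000)
searchitem <- as.character(as.hexmode(n1))

t_lookup <- min_elapsed(function() which(seriesname == searchitem))
print(t_lookup)
```

Taking the minimum rather than the mean discards runs that were slowed down by unrelated system activity.<br>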
<br>
as.character(as.hexmode()) will install those strings in R's global string<br>
cache. The 2nd time will be faster as all those strings are already<br>
cached. Whether that explains this case I don't know, but it seems plausible as<br>
the difference is only 4 ms. That part could be split out, repeated and timed<br>
separately.<br>
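<br>
One way to time the string construction on its own, as a rough sketch. The value of n1 is taken from the larger table in the example (6e6 / 5000), and the number of repeats is an arbitrary choice:<br>

```r
# Number of distinct series names in the larger table (6e6 / 5000).
n1 <- 1200

# Time only the string construction, five times in a row.
# If the string cache matters, the first run may be slower than the
# later ones; with so few strings the times may all be close to zero.
string_times <- vapply(1:5, function(i) {
  system.time(s <- as.character(as.hexmode(1:n1)))[["elapsed"]]
}, numeric(1))
print(string_times)
```

If the repeated timings are indistinguishable, the string cache is unlikely to explain a 4 ms difference.<br>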
<br>
I think a simpler example would be possible, too. I missed the reason why<br>
it's in a loop over 0:1; at a scale of 4 ms, something like that might itself be making<br>
a tiny difference.<br>
<br>
HTH, Matthew<br>
<div><div class="h5"><br>
> Dear all,<br>
><br>
> Please see my reproducible example below. My question is: why does the 2nd<br>
> table, which is bigger, have a smaller access time?<br>
><br>
>> library(xtable)<br>
>> library(data.table)<br>
> data.table 1.7.2 For help type: help("data.table")<br>
>> start.size<-6e+5<br>
>><br>
>> time.data.table<-list()<br>
>><br>
>> for (i in 0:1){<br>
> + n<-start.size*10^i<br>
> + n1<-n/5000<br>
> + my.data.table<-data.table(index=1:n,seriesname=rep(as.character(as.hexmode(1:n1)),each=5000),value=rnorm(n))<br>
> + setkey(my.data.table,"seriesname")<br>
> + time.data.table[[i+1]]<-system.time(my.data.table[J(as.character(as.hexmode(n1/4))),])<br>
> + }<br>
><br>
>><br>
>> rbind(time.data.table[[1]],time.data.table[[2]])<br>
> user.self sys.self elapsed user.child sys.child<br>
> [1,] 0.008 0 0.008 0 0<br>
> [2,] 0.004 0 0.004 0 0<br>
>> time.data.table[[1]]<br>
> user system elapsed<br>
> 0.008 0.000 0.008<br>
>> time.data.table[[2]]<br>
> user system elapsed<br>
> 0.004 0.000 0.004<br>
>><br>
><br>
> Many thanks,<br>
> Ashim<br>
</div></div>
<br>
<br>
</blockquote></div><br>