[datatable-help] Performance observation
Alexandre Sieira
alexandre.sieira at gmail.com
Tue May 28 19:37:16 CEST 2013
While working on some code today, I ran into a case where the performance of data.table surprised me a little. Is this expected?
> dt = data.table(a=rnorm(1000000))
> system.time( for(i in 1:100000) j = dt[i, a] )
usuário sistema decorrido
78.064 0.426 78.034
> system.time( for(i in 1:100000) j = dt[i, "a", with=FALSE] )
usuário sistema decorrido
27.814 0.154 27.810
> system.time( for(i in 1:100000) j = dt[["a"]][i] )
usuário sistema decorrido
1.227 0.006 1.225
(Sorry about the output in Portuguese; the three columns are user, system and elapsed time, in seconds.)
Not knowing anything about how data.table is implemented internally, I would have assumed that the three syntaxes for accessing a value would perform similarly, or differ by only a small margin.
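For anyone who wants to reproduce the comparison, here is a minimal self-contained sketch. The helper names (by_j_expr, by_j_name, by_column), the seed and the reduced iteration count are assumptions made for illustration and are not part of the session above. The sketch also checks what each form returns: the first and third give a bare numeric value, while the with=FALSE form gives a one-row data.table, so the three calls are not strictly equivalent.

```r
library(data.table)

set.seed(1)           # arbitrary seed, only so repeated runs use the same data
n_rows  <- 1000000    # same table size as in the session above
n_iters <- 1000       # assumption: far fewer iterations than the 100000 above

dt <- data.table(a = rnorm(n_rows))

# The three access forms from the session above, wrapped as (hypothetical) helpers.
by_j_expr <- function(i) dt[i, a]                   # j evaluated as an expression
by_j_name <- function(i) dt[i, "a", with = FALSE]   # column selected by name
by_column <- function(i) dt[["a"]][i]               # extract the column, then index it

# Check the return values before timing anything.
stopifnot(identical(by_j_expr(5L), by_column(5L)))          # both bare numerics
stopifnot(is.data.table(by_j_name(5L)))                     # a one-row data.table
stopifnot(identical(by_j_name(5L)[["a"]], by_column(5L)))   # same underlying value

# Elapsed seconds for n_iters single-row lookups with each form.
timings <- sapply(
  list(j_expr = by_j_expr, j_name = by_j_name, column = by_column),
  function(f) system.time(for (i in seq_len(n_iters)) f(i))[["elapsed"]]
)
print(timings)
```

Absolute timings will depend on the machine and on the data.table version, so no expected output is shown here.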
--
Alexandre Sieira
CISA, CISSP, ISO 27001 Lead Auditor
"The truth is rarely pure and never simple."
Oscar Wilde, The Importance of Being Earnest, 1895, Act I