[datatable-help] Speeding up column references with roll
Arunkumar Srinivasan
aragorn168b at gmail.com
Mon Jun 30 19:00:17 CEST 2014
Once again, this has been fixed in 1.9.3. A join now requires an explicit `by=.EACHI` to perform a by-without-by.
https://github.com/Rdatatable/data.table/blob/master/README.md
Have a look at the first FR (by = .EACHI runs ...) that's been fixed in 1.9.3. These changes, which have been discussed for quite some time, alter the results a join returns in order to bring more consistency to the DT[i, j, by] syntax. Also have a look at the second FR and the links it points to for the discussions.
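To make the difference concrete, here is a minimal sketch on toy data (the tables `X` and `Y` and the column name `total` are made up for illustration, and it assumes 1.9.3 or later). Without `by=.EACHI`, j is evaluated once over all matched rows; with it, j is evaluated once per row of i:

```r
library(data.table)

# Toy tables, hypothetical and only for illustration
X <- data.table(id = c("a", "a", "b"), v = 1:3, key = "id")
Y <- data.table(id = c("a", "b"), key = "id")

# No by: j runs once over every row of X matched by Y
overall <- X[Y, sum(v)]

# by=.EACHI: j runs once for each row of Y (the old by-without-by behaviour)
per_row <- X[Y, list(total = sum(v)), by = .EACHI]
```

Here `overall` should be a single number, while `per_row` should have one row per row of `Y`, with the join column `id` followed by `total`.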
In general, when you run into a bug, it's better to test with the devel version first (and have a look at the README).
Arun
From: Stavros Macrakis (Σταῦρος Μακράκης) macrakis at alum.mit.edu
Reply: Stavros Macrakis (Σταῦρος Μακράκης) macrakis at alum.mit.edu
Date: June 30, 2014 at 5:38:10 PM
To: datatable-help at r-forge.wu-wien.ac.at datatable-help at r-forge.wu-wien.ac.at
Subject: [datatable-help] Speeding up column references with roll
In the following example, it is about 15-25% faster to use setnames rather than j=list(name=var). Is there some better approach to referencing the other joined column when using roll?
# Use j=list(name=var)
calc1 <- function(d) {
  d[ hit==1
   ][ d,list(hittime=time),roll=-20
   ][ !is.na(hittime)
   ]
}

# Use setnames
calc2 <- function(d) {
  temp <- d[ hit==1
           ][ d,time,roll=-20
           ]
  setnames(temp,3,"hittime")
  temp[!is.na(hittime)]
}
# Generate sample data
library(data.table)
set.seed(12312391)
data <- data.table(
  group = sample(1e3,1e7,replace=TRUE),
  time = ceiling(runif(1e7, 0, 1e5)),
  hit = rbinom(1e7, 1, prob = 0.1),
  key=c("group","time"))
# Timing
system.time(replicate(10,{gc();calc1(data)}))  # => 69 sec
system.time(replicate(10,{gc();calc2(data)}))  # => 52 sec
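For readers unfamiliar with `roll`, here is a small sketch of what `roll=-20` matches, on made-up data (the tables `hits` and `obs` are hypothetical). It is not a benchmarked answer to the question above; it only shows one possible way to keep the matched time, by copying the key column into `hittime` before joining, so that no renaming is needed in j:

```r
library(data.table)

# Hypothetical "hit" rows for one group, keyed like the sample data
hits <- data.table(group = 1L, time = c(10L, 50L), key = c("group", "time"))
hits[, hittime := time]  # copy the key column so the matched time survives the join

# Hypothetical observations to look up
obs <- data.table(group = 1L, time = c(5L, 25L, 40L, 60L))

# roll = -20: roll the next hit backward, but only across a gap of at most 20
res <- hits[obs, roll = -20]
```

In `res`, `time` holds the values from `obs`, and `hittime` is the time of the next hit no more than 20 units later, or NA when there is none. Whether copying the column is faster than either function above has not been measured here.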