[datatable-help] How to speed up grouping time series, help please

Matthew Dowle mdowle at mdowle.plus.com
Tue Apr 5 14:19:57 CEST 2011


It's easier to help if you provide timings along with your reproducible
example code, please.
How long is it taking, and how long do you think it should take?
Please also try to avoid phrases such as "without success". Does that mean
you got an error message (if so, what was it) or a wrong result (if so,
what was wrong)?
Matthew
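
For reference, here is a minimal base R sketch of the reshaping asked about in the quoted message below, with a timing of the kind requested above. It is one possible approach under stated assumptions, not a method taken from this thread. The helper name long_to_wide is hypothetical, and it assumes DATE holds fixed-width "YYYY-MM-DD HH:MM:SS" strings (so that sorting them as text is chronological) and that each (ID, DATE) pair occurs at most once.

```r
# Hypothetical helper (not from the thread): long data frame -> wide matrix.
# Assumes DATE is a fixed-width "YYYY-MM-DD HH:MM:SS" string and that each
# (ID, DATE) pair appears at most once.
long_to_wide <- function(x) {
  dates <- sort(unique(x$DATE))  # fixed-width ISO strings sort chronologically
  ids   <- sort(unique(x$ID))
  m <- matrix(NA_real_, nrow = length(dates), ncol = length(ids),
              dimnames = list(dates, as.character(ids)))
  # A two-column index matrix assigns all (row, column) cells in one step;
  # cells with no observation stay NA.
  m[cbind(match(x$DATE, dates), match(x$ID, ids))] <- x$VALUE
  m
}

# Test data along the lines of the quoted example, with an explicit format
# so that every timestamp string has the same width.
set.seed(123)
N <- 1000
K <- 3
stamps <- format(as.POSIXct("2000-01-01", tz = "GMT") + 0:(N - 1),
                 "%Y-%m-%d %H:%M:%S")
X <- data.frame(ID = rep(1:K, each = N),
                DATE = rep(stamps, K),
                VALUE = runif(N * K),
                stringsAsFactors = FALSE)
X <- X[sample(nrow(X)), ]                         # random row order
X <- X[-sample(nrow(X), floor(nrow(X) * 0.2)), ]  # drop 20% of the rows

m <- long_to_wide(X)

# Elapsed time for 50 conversions, to compare against other approaches.
timing <- system.time(for (i in 1:50) long_to_wide(X))[["elapsed"]]
```

The resulting matrix carries the timestamps as row names, so it could then be passed to a time series constructor, for example xts::xts(m, as.POSIXct(rownames(m), tz = "GMT")). Whether this is actually faster than the merge-based version depends on the data and should be measured.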

"Daniele Amberti" <daniele.amberti at ors.it> wrote in message 
news:5C57984CA179A247803E12AAB0F7ABA6DB20979608 at adorsmail01.ors.local...
>A few hundred times, I retrieve a group of time series (10-15 series with 
>10000 values each); on every group I do some calculations, graphs etc. I 
>wonder if there is a faster method than the one presented below to get an 
>appropriate time series object.
>
> Making a query with RODBC for every group I get a data frame like this:
>
>> X
>  ID                DATE     VALUE
> 14  3 2000-01-01 00:00:03 0.5726334
> 4   1 2000-01-01 00:00:03 0.8830174
> 1   1 2000-01-01 00:00:00 0.2875775
> 15  3 2000-01-01 00:00:04 0.1029247
> 11  3 2000-01-01 00:00:00 0.9568333
> 9   2 2000-01-01 00:00:03 0.5514350
> 7   2 2000-01-01 00:00:01 0.5281055
> 6   2 2000-01-01 00:00:00 0.0455565
> 12  3 2000-01-01 00:00:01 0.4533342
> 8   2 2000-01-01 00:00:02 0.8924190
> 3   1 2000-01-01 00:00:02 0.4089769
> 13  3 2000-01-01 00:00:02 0.6775706
>
> And I want to get a timeSeries object or xts object like this:
>
>                           1         2         3
> 2000-01-01 00:00:00 0.2875775 0.0455565 0.9568333
> 2000-01-01 00:00:01        NA 0.5281055 0.4533342
> 2000-01-01 00:00:02 0.4089769 0.8924190 0.6775706
> 2000-01-01 00:00:03 0.8830174 0.5514350 0.5726334
> 2000-01-01 00:00:04        NA        NA 0.1029247
>
> Both classes accept a matrix, so if I can create a matrix like the one 
> shown above, plus a character vector of dates, faster than is possible 
> with xts:::merge, for example, I will have a faster implementation. This 
> is why I'm writing to datatable-help. I read the vignettes and tests, and 
> tried to generate a set of data.tables (using .SD and by = ID) and then 
> use CJ, but without success so far. Any input to test this approach would 
> be much appreciated.
>
> Input data can be sorted or unsorted (the most complicated case is the one 
> in the example: unsorted, with missing data); I can sort in the query if 
> that gives an advantage.
>
> Below is some code to generate the test case above.
>
> Thanks in advance for any input, best regards,
> Daniele
>
>
> set.seed(123)
> N <- 100 # number of observations, use 5 to replicate test case above
> K <- 3   # number of timeseries ID
>
> X <- data.frame(
> ID = rep(1:K, each = N),
> DATE = as.character(rep(as.POSIXct("2000-01-01", tz = "GMT") + 0:(N-1), K)),
> VALUE = runif(N*K), stringsAsFactors = FALSE)
>
> # sample observations to get random order (optional)
> X <- X[sample(1:(N*K), N*K),]
> X <- X[-(sample(1:nrow(X), floor(nrow(X)*0.2))),] # 20% missing
>
> head(X, 15)
>
>
> # an implementation in xts:
> xtsSplit <- function(x)
> {
> library(xts)
> x <- xts(x[,c("ID","VALUE")], as.POSIXct(x[,"DATE"]))
> x <- do.call(merge, split(x$VALUE,x$ID))
> return(x)
> }
>
> # time 50 runs and report the median user time
> xtsSplitTime <- replicate(50,
> system.time(xtsSplit(X))[[1]])
> median(xtsSplitTime)
>
>
> ORS Srl
>
> Via Agostino Morando 1/3 12060 Roddi (Cn) - Italy
> Tel. +39 0173 620211
> Fax. +39 0173 620299 / +39 0173 433111
> Web Site www.ors.it
>