[datatable-help] Automagic timezone conversion on data.table rbind

Matthew Dowle mdowle at mdowle.plus.com
Wed Sep 1 09:29:25 CEST 2010


Nicolas,

> I will impatiently await the integer date and time classes.

Definitely just install v1.5 from the R-Forge repo then. IDate is "in
development" only in the sense that it isn't on CRAN yet, not in the
sense that it doesn't work or that there are things left to do. v1.5 is
building cleanly on all platforms, so if there's something in it you
need, it's good to go. See the NEWS file on the homepage for new
features and bug fixes.
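
As a rough illustration of what the integer date and time classes give
you, here is a minimal sketch. It assumes v1.5 or later is installed
(the install line is commented out and uses the usual R-Forge
repository URL); the table dt and its columns d and v are made up for
the example.

```r
# Assumption: data.table >= 1.5 is installed, e.g. from R-Forge:
# install.packages("data.table", repos = "http://R-Forge.R-project.org")
library(data.table)

# A small table with an integer-backed date column (IDate)
dt <- data.table(d = as.IDate(c("2010-01-03", "2010-01-01", "2010-01-02")),
                 v = c(3, 1, 2))

# IDate is stored as integer, so it can be used as a key directly,
# with no need to change storage.mode by hand
setkey(dt, d)

# Split a POSIXct value into an integer date and an integer time of day
p <- as.POSIXct("2010-01-01 12:30:00", tz = "GMT")
idt <- IDateTime(p)
```

Here idt is a two-column data.table (idate and itime), both integer
based, which is what makes sorting and grouping on them fast.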

Matthew

On Tue, 2010-08-31 at 22:24 -0400, Nicolas Chapados wrote:
> Hi Tom,
> 
> 
> Thanks for your reply, and for the trick of forcibly establishing the
> timezone to GMT.
> 
> 
> I'm surprised by the behavior with c(), and can understand that
> data.table inherits it.  However, for the record, when rbinding
> data.frames (instead of data.tables), there is no loss of timezone
> info (this is what I'm currently doing in my code, building an
> intermediate data.frame, and then converting it wholesale to
> data.table when it's done -- timezones stay what they should be, even
> though this is not optimal).  So apparently, there are multiple
> instances of core R code doing date vector concatenation, and they
> disagree as to timezone handling...
> 
> 
> I will impatiently await the integer date and time classes.  (I
> currently have to manually set the storage.mode of a POSIXct vector to
> "integer" to be able to use them as keys in my data.tables.)
>  Moreover, I looked into chron and it's not suitable for a few things
> I need from POSIXct.
> 
> 
>     Best,
>     + Nicolas
> 
> 
> On Tue, Aug 31, 2010 at 10:05 PM, Short, Tom <TShort at epri.com> wrote:
>         Nicolas, 
>          
>         That's not really a data.table issue. It's generally a problem
>         with POSIXct. See here:
>          
>         > x = seq(as.POSIXct("2010-01-01", "GMT"),
>         as.POSIXct("2010-01-10", "GMT"), length.out=2)
>         > 
>         > c(x,x)
>         [1] "2009-12-31 19:00:00 EST" "2010-01-09 19:00:00 EST"
>         [3] "2009-12-31 19:00:00 EST" "2010-01-09 19:00:00 EST"
>         
>         To get around this, I generally fix the timezone for the
>         system as follows:
>          
>         > Sys.setenv(TZ="GMT") # time zones drive me crazy!
>         > 
>         > c(x,x)
>         [1] "2010-01-01 GMT" "2010-01-10 GMT" "2010-01-01 GMT"
>         "2010-01-10 GMT"
>         
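If changing TZ for the whole session is not an option, another
possibility is to put the attribute back on the result after
concatenating. This is only a sketch: it assumes every input shares the
same timezone ("GMT" here), and the name y is just for illustration.

```r
# Two GMT timestamps, as in the example above
x <- seq(as.POSIXct("2010-01-01", tz = "GMT"),
         as.POSIXct("2010-01-10", tz = "GMT"), length.out = 2)

# Depending on the R version, c() may drop the "tzone" attribute
y <- c(x, x)

# Restore it explicitly; the underlying instants are unchanged,
# only the timezone used for printing is affected
attr(y, "tzone") <- "GMT"

format(y, "%Y-%m-%d %H:%M:%S %Z")
```

The same assignment can be applied to a column after an rbind, since
the stored values themselves are never altered by the timezone change.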
>         Also, if you just have dates, it's better to stick with the
>         Date class (or chron). In the development version of
>         data.table, we also have integer date and time classes that
>         work well with data.table (faster sorting and grouping).
>          
>         - Tom
>          
>         
>                 
>                 ______________________________________________________
>                 From:
>                 datatable-help-bounces at lists.r-forge.r-project.org
>                 [mailto:datatable-help-bounces at lists.r-forge.r-project.org] On Behalf Of Nicolas Chapados
>                 Sent: Tuesday, August 31, 2010 15:40
>                 To: datatable-help at lists.r-forge.r-project.org
>                 Subject: [datatable-help] Automagic timezone
>                 conversion on data.table rbind
>                 
>                 
>                 
>                 
>                 Dear list, 
>                 
>                 
>                 I'm using a data.table that contains date columns (in
>                 POSIXct format).  There appears to be an issue with
>                 automatic timezone conversion when rbinding from NULL
>                 (or an empty data.table), as would occur when one is
>                 progressively building up the table in a loop.
>                 
>                 
>                 For example:
>                 
>                 
>                 > require(data.table)
>                 Loading required package: data.table
>                 
>                 
>                 ## Currently located in the North-America Eastern
>                 ## Standard Time zone.
>                 > Sys.timezone()
>                 [1] "America/Montreal"
>                 > a <- NULL
>                 
>                 
>                 ## Create a table with a single date column (GMT
>                 ## timezone)
>                 > b <- data.table(x = seq(as.POSIXct("2010-01-01",
>                 "GMT"), as.POSIXct("2010-01-10", "GMT"),
>                 length.out=10))
>                 > b
>                                x
>                  [1,] 2010-01-01
>                  [2,] 2010-01-02
>                  [3,] 2010-01-03
>                  [4,] 2010-01-04
>                  [5,] 2010-01-05
>                  [6,] 2010-01-06
>                  [7,] 2010-01-07
>                  [8,] 2010-01-08
>                  [9,] 2010-01-09
>                 [10,] 2010-01-10
>                 
>                 
>                 ## Bind it from NULL: Oops!  The timezone changes to
>                 ## EST!
>                 > rbind(a,b)
>                                         x
>                  [1,] 2009-12-31 19:00:00
>                  [2,] 2010-01-01 19:00:00
>                  [3,] 2010-01-02 19:00:00
>                  [4,] 2010-01-03 19:00:00
>                  [5,] 2010-01-04 19:00:00
>                  [6,] 2010-01-05 19:00:00
>                  [7,] 2010-01-06 19:00:00
>                  [8,] 2010-01-07 19:00:00
>                  [9,] 2010-01-08 19:00:00
>                 [10,] 2010-01-09 19:00:00
>                 
>                 
>                 ## Same behavior from an empty data table.
>                 > rbind(data.table(), b)
>                                         x
>                  [1,] 2009-12-31 19:00:00
>                  [2,] 2010-01-01 19:00:00
>                  [3,] 2010-01-02 19:00:00
>                  [4,] 2010-01-03 19:00:00
>                  [5,] 2010-01-04 19:00:00
>                  [6,] 2010-01-05 19:00:00
>                  [7,] 2010-01-06 19:00:00
>                  [8,] 2010-01-07 19:00:00
>                  [9,] 2010-01-08 19:00:00
>                 [10,] 2010-01-09 19:00:00
>                 
>                 
>                 This behavior is somewhat puzzling.  Any pointers as
>                 to how to preserve timezone information would be
>                 greatly appreciated!  For the record, I'm using R
>                 version 2.9.2 and data.table version 1.4.1 (the latest
>                 on CRAN).
>                 
>                 
>                 Best regards,
>                 + Nicolas Chapados
>                 
>                 
> 
> 



