[datatable-help] possible bug when running setkey on POSIXct column

Matthew Dowle mdowle at mdowle.plus.com
Sun Jan 20 17:13:53 CET 2013


Indeed. I reproduced. Now fixed. Well done and thanks statquant.

o setkey could sort 'double' columns (such as POSIXct) incorrectly when
   not the last column of the key, #2484. In data.table's C code :
      x[a]>x[b]-tol
   should have been :
      x[a]-x[b]>-tol  [or  x[b]-x[a]<tol ]
   The difference may have been machine/compiler dependent. Many thanks
   to statquant for the short reproducible example. Test added.
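
To see why the two forms can differ, here is a minimal sketch in C. It is
not data.table's actual source: the helper names gt_old and gt_fixed are
made up, and TOL = 2^-26 (i.e. sqrt(DBL_EPSILON), about 1.5e-8) is an
assumed tolerance, since the value of tol is not stated above. POSIXct
values for 2008 are around 1.2e9 seconds, where adjacent doubles are about
2.4e-7 apart, so x[b]-tol rounds straight back to x[b] once it is stored as
a 64-bit double. If the intermediate were instead kept in a wider register
(x87 extended precision, say), that rounding might not happen, which would
be consistent with the machine/compiler dependence.

```c
/* Illustrative sketch only; names and TOL are assumptions, not data.table code. */

/* Assumed tolerance: 2^-26, which equals sqrt(DBL_EPSILON) for IEEE doubles. */
#define TOL 1.4901161193847656e-8

/* Old form: x[a] > x[b] - tol.
   The volatile store forces the intermediate to be rounded to a 64-bit
   double. For large xb, tol is below half the spacing between doubles,
   so rhs ends up equal to xb. */
int gt_old(double xa, double xb, double tol) {
    volatile double rhs = xb - tol;
    return xa > rhs;
}

/* Fixed form: x[a] - x[b] > -tol.
   The difference of two nearby values is small, so comparing it with
   -tol does not lose the tolerance to rounding. */
int gt_fixed(double xa, double xb, double tol) {
    volatile double diff = xa - xb;
    return diff > -tol;
}
```

With xa = xb = 1229299200.0 (2008-12-15 00:00:00 taken as UTC, another
assumption), gt_old(xa, xb, TOL) returns 0 while gt_fixed(xa, xb, TOL)
returns 1. With xa = xb = 1.0 both return 1, so the two forms only part
ways once the values are large relative to the tolerance.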


On 18.01.2013 21:48, Erik Iverson wrote:
> Jim, what platform are you on? I can reproduce on Linux (and indeed
> recently saw something similar with my own data).
>
> Best,
> --Erik
>
> On Fri, Jan 18, 2013 at 3:45 PM, jim holtman <jholtman at gmail.com> wrote:
>>
>> seems to work for me:
>>
>> > library(data.table)
>> data.table 1.8.6  For help type: help("data.table")
>> > DT = data.table(X=as.POSIXct(
>>     c(rep("15DEC2008:00:00:00",10),"15DEC2008:00:00:00",rep("17DEC2008:00:00:00",2)),
>>     format="%d%b%Y:%H:%M:%S"),
>>   Y=c(1534,61,74,518,519,1519,1520,1524,3127,29250,30609,43,7853))
>> > setkey(DT,X,Y)
>> > DT
>>              X     Y
>>  1: 2008-12-15    61
>>  2: 2008-12-15    74
>>  3: 2008-12-15   518
>>  4: 2008-12-15   519
>>  5: 2008-12-15  1519
>>  6: 2008-12-15  1520
>>  7: 2008-12-15  1524
>>  8: 2008-12-15  1534
>>  9: 2008-12-15  3127
>> 10: 2008-12-15 29250
>> 11: 2008-12-15 30609
>> 12: 2008-12-17    43
>> 13: 2008-12-17  7853
>>
>>
>> On Fri, Jan 18, 2013 at 9:44 AM, statquant <statquant at outlook.com> wrote:
>> > Hello I might have found a bug.
>> > I really cannot explain the following (please be indulgent, as I
>> > narrowed it down as much as I could).
>> >
>> > library(data.table)
>> > DT = data.table(X=as.POSIXct(
>> >     c(rep("15DEC2008:00:00:00",10),"15DEC2008:00:00:00",rep("17DEC2008:00:00:00",2)),
>> >     format="%d%b%Y:%H:%M:%S"),
>> >   Y=c(1534,61,74,518,519,1519,1520,1524,3127,29250,30609,43,7853))
>> > setkey(DT,X,Y)
>> >
>> > #Here is what I see after the sort
>> >
>> > DT
>> >              X     Y
>> >  1: 2008-12-15  1534
>> >  2: 2008-12-15    61
>> >  3: 2008-12-15    74
>> >  4: 2008-12-15   518
>> >  5: 2008-12-15   519
>> >  6: 2008-12-15  1519
>> >  7: 2008-12-15  1520
>> >  8: 2008-12-15  1524
>> >  9: 2008-12-15  3127
>> > 10: 2008-12-15 29250
>> > 11: 2008-12-15 30609
>> > 12: 2008-12-17    43
>> > 13: 2008-12-17  7853
>> >
>> > #I thought that it was a POSIXct problem, but I can get the correct
>> > answer like this:
>> >
>> > DT[order(X,Y),]
>> >              X     Y
>> >  1: 2008-12-15    61
>> >  2: 2008-12-15    74
>> >  3: 2008-12-15   518
>> >  4: 2008-12-15   519
>> >  5: 2008-12-15  1519
>> >  6: 2008-12-15  1520
>> >  7: 2008-12-15  1524
>> >  8: 2008-12-15  1534
>> >  9: 2008-12-15  3127
>> > 10: 2008-12-15 29250
>> > 11: 2008-12-15 30609
>> > 12: 2008-12-17    43
>> > 13: 2008-12-17  7853
>> >
>> > #Here is my session (just launched it)
>> >
>> > R> sessionInfo()
>> > R version 2.15.2 (2012-10-26)
>> > Platform: x86_64-pc-linux-gnu (64-bit)
>> >
>> > locale:
>> >  [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C               LC_TIME=C
>> >  [4] LC_COLLATE=en_GB.UTF-8     LC_MONETARY=fr_FR.UTF-8    LC_MESSAGES=en_GB.UTF-8
>> >  [7] LC_PAPER=C                 LC_NAME=C                  LC_ADDRESS=C
>> > [10] LC_TELEPHONE=C             LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C
>> >
>> > attached base packages:
>> > [1] stats     graphics  grDevices datasets  utils     methods   base
>> >
>> > other attached packages:
>> > [1] data.table_1.8.7 inline_0.3.10    Rcpp_0.10.2      vimcom_0.9-5     setwidth_1.0-2
>> > [6] colorout_0.9-9
>> >
>> >
>> >
>> > _______________________________________________
>> > datatable-help mailing list
>> > datatable-help at lists.r-forge.r-project.org
>> >
>> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
>>
>>
>>
>> --
>> Jim Holtman
>> Data Munger Guru
>>
>> What is the problem that you are trying to solve?
>> Tell me what you want to do, not how you want to do it.

