[datatable-help] Factors may lose ordered class when used as keys

Allan Engelhardt allane at cybaea.com
Tue Jun 7 21:35:02 CEST 2011


On 07/06/11 19:35, Matthew Dowle wrote:
> The documentation could be improved, but ?setkey does say :
>
>    "The columns are sorted in ascending order always."

Yes, but my beef was not with the sort order but with the fact that the
class of the column changes without warning.  I have no problem with the
"A" value coming before "B" in the data.table, but surely you can do that
by sorting on the levels() character information, without changing the
class of the column?  That is: I do not expect that X[1,A] < X[2,A] when
A is an ordered factor; I expect that
as.character(X[1,A]) < as.character(X[2,A]).
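
To make the distinction concrete, here is a minimal base-R sketch (no
data.table involved, so it says nothing about what setkey does
internally).  The names f, idx and g are only for illustration; f
mirrors the factor from my original example.

```r
# Hypothetical ordered factor with reversed levels: Z < Y < ... < B < A
f <- factor(LETTERS[1:3], levels = rev(LETTERS), ordered = TRUE)

# Comparing ordered factors follows the level order, so "A" is the largest value
f[1] < f[2]                              # FALSE
as.character(f[1]) < as.character(f[2])  # TRUE

# Sorting by the character labels only permutes the elements;
# the class and the levels are left untouched
idx <- order(as.character(f))
g <- f[idx]
stopifnot(is.ordered(g), identical(levels(g), levels(f)))
```

So an ascending sort on the labels and an intact ordered class can
coexist, at least in plain R.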

Allan

> More information in previous thread :
>
> http://r.789695.n4.nabble.com/Behavior-of-setkey-with-factors-tp2319612p2319612.html
>
>
> Matthew
>
>
>
> On Tue, 2011-06-07 at 07:50 +0100, Allan Engelhardt wrote:
>> Is it documented anywhere that factors may lose their ordered-ness when
>> used as keys?  E.g.
>>
>> library("data.table")
>> F <- factor(LETTERS[1:3], levels = rev(LETTERS), ordered = TRUE)
>> X <- data.table(A = F, B = F, key = "A")
>> str(X)                     # A is no longer ordered; B still is
>> stopifnot(is.ordered(X$B)) # OK
>> stopifnot(is.ordered(X$A)) # Fails!
>>
>> I can kind of see why it might happen, but it still caught me by
>> surprise, and if it ever happens on (some?) ad-hoc index lookups then it
>> will really cause bugs in my code....
>>
>> Allan
>> (Above is simpler version of example sent off-list to Matthew)
>>
>> > sessionInfo()
>> R version 2.13.0 (2011-04-13)
>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>
>> locale:
>>  [1] LC_CTYPE=en_GB.utf8       LC_NUMERIC=C
>>  [3] LC_TIME=en_GB.utf8        LC_COLLATE=en_GB.utf8
>>  [5] LC_MONETARY=C             LC_MESSAGES=en_GB.utf8
>>  [7] LC_PAPER=en_GB.utf8       LC_NAME=C
>>  [9] LC_ADDRESS=C              LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] data.table_1.6 ctv_0.7-2
>>
>> loaded via a namespace (and not attached):
>> [1] tools_2.13.0