[datatable-help] Factors may lose ordered class when used as keys

Matthew Dowle mdowle at mdowle.plus.com
Tue Jun 7 20:35:22 CEST 2011


The documentation could be improved, but ?setkey does say:

  "The columns are sorted in ascending order always."

There is more information in a previous thread:

http://r.789695.n4.nabble.com/Behavior-of-setkey-with-factors-tp2319612p2319612.html


Matthew



On Tue, 2011-06-07 at 07:50 +0100, Allan Engelhardt wrote:
> Is it documented anywhere that factors may lose their ordered-ness when 
> used as keys?  E.g.
> 
> library("data.table")
> F <- factor(LETTERS[1:3], levels = rev(LETTERS), ordered = TRUE)
> X <- data.table(A = F, B = F, key = "A")
> str(X)                     # A is no longer ordered; B still is
> stopifnot(is.ordered(X$B)) # OK
> stopifnot(is.ordered(X$A)) # Fails!
> 
> I can kind of see why it might happen, but it still caught me by 
> surprise, and if it ever happens on (some?) ad-hoc index lookups then it 
> will really cause bugs in my code....
> 
> Allan
> (Above is simpler version of example sent off-list to Matthew)
> 
>  > sessionInfo()
> R version 2.13.0 (2011-04-13)
> Platform: x86_64-unknown-linux-gnu (64-bit)
> 
> locale:
>   [1] LC_CTYPE=en_GB.utf8       LC_NUMERIC=C
>   [3] LC_TIME=en_GB.utf8        LC_COLLATE=en_GB.utf8
>   [5] LC_MONETARY=C             LC_MESSAGES=en_GB.utf8
>   [7] LC_PAPER=en_GB.utf8       LC_NAME=C
>   [9] LC_ADDRESS=C              LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C
> 
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
> 
> other attached packages:
> [1] data.table_1.6 ctv_0.7-2
> 
> loaded via a namespace (and not attached):
> [1] tools_2.13.0
> 
> _______________________________________________
> datatable-help mailing list
> datatable-help at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
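
One possible workaround for the behaviour reported above, as a minimal base-R sketch: save the level order before keying and rebuild the ordered factor afterwards. This assumes the key column has lost its "ordered" class and had its levels re-sorted ascending, as the report describes; that state is simulated here without data.table, and restore_ordered is a hypothetical helper, not part of the package.

```r
# Hypothetical helper (not part of data.table): rebuild a factor as an
# ordered factor using a level order that was saved before keying.
restore_ordered <- function(x, saved_levels) {
  factor(as.character(x), levels = saved_levels, ordered = TRUE)
}

# Same ordered factor as in the report: levels run Z down to A.
f <- factor(LETTERS[1:3], levels = rev(LETTERS), ordered = TRUE)
saved_levels <- levels(f)

# Simulate the reported outcome for the key column: levels re-sorted
# ascending and the "ordered" class dropped.
g <- factor(as.character(f), levels = sort(levels(f)))

# Restore the original level order and the ordered class.
h <- restore_ordered(g, saved_levels)
```

Whether the restored column can then be used as a key again without being re-sorted is not something this sketch addresses.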



