[Vegan-commits] r541 - in pkg: . R inst

noreply at r-forge.r-project.org noreply at r-forge.r-project.org
Sat Nov 1 08:55:10 CET 2008


Author: jarioksa
Date: 2008-11-01 08:55:10 +0100 (Sat, 01 Nov 2008)
New Revision: 541

Modified:
   pkg/DESCRIPTION
   pkg/R/nesteddisc.R
   pkg/inst/ChangeLog
Log:
nesteddisc: orders tied columns to minimize the statistic

Modified: pkg/DESCRIPTION
===================================================================
--- pkg/DESCRIPTION	2008-10-30 13:17:15 UTC (rev 540)
+++ pkg/DESCRIPTION	2008-11-01 07:55:10 UTC (rev 541)
@@ -1,7 +1,7 @@
 Package: vegan
 Title: Community Ecology Package
-Version: 1.16-3
-Date: October 27, 2008
+Version: 1.16-4
+Date: November 1, 2008
 Author: Jari Oksanen, Roeland Kindt, Pierre Legendre, Bob O'Hara, Gavin L. Simpson, 
    Peter Solymos, M. Henry H. Stevens, Helene Wagner  
 Maintainer: Jari Oksanen <jari.oksanen at oulu.fi>

Modified: pkg/R/nesteddisc.R
===================================================================
--- pkg/R/nesteddisc.R	2008-10-30 13:17:15 UTC (rev 540)
+++ pkg/R/nesteddisc.R	2008-11-01 07:55:10 UTC (rev 541)
@@ -1,12 +1,65 @@
-"nesteddisc" <-
-function(comm)
+`nesteddisc` <-
+    function(comm)
 {
+    ## The original discrepancy method orders columns by frequencies,
+    ## but does not consider ties. The current function tries to order
+    ## tied values to minimize the discrepancy either by complete
+    ## enumeration or with a larger number of ties using simulated
+    ## annealing.  NALL: max no. of tied items for NALL! complete
+    ## enumeration
+
+    ## starting values and CONSTANTS
+    NALL <- 7
+	NITER <- 5000
+    ties <- FALSE
+    trace <- FALSE
+    ## Code
     comm <- ifelse(comm > 0, 1, 0)
-    rs <- colSums(comm)
-    j <- rev(order(rs))
-    comm <- comm[, j] 
-    Ad <- sum(comm[col(comm) <= rowSums(comm)] == 0)
-    out <- list(statistic=Ad, ties = length(unique(rs)) < length(rs))
+    cs <- colSums(comm)
+    j <- rev(order(cs))
+    ## initial order
+    cs <- cs[j]
+    comm <- comm[, j]
+    ## run lengths: numbers of tied values
+    le <- rle(cs)$lengths
+    cle <- cumsum(le)
+    x <- seq(along=cs)
+    ## Range of row sums: only swaps between these have an effect
+    rs <- range(rowSums(comm))
+    ## Function to evaluate discrepancy
+    FUN <- function(x) sum(comm[col(comm)[,x] <= rowSums(comm)] == 0) 
+    ## Go through all le-items and permute ties
+    for (i in 1:length(le)) {
+        if (le[i] > 1) {
+            take <- x
+            idx <- (1:le[i]) + if(i == 1) 0 else  cle[i-1]
+            ## Can swaps influence discrepancy?
+            if (idx[1] > rs[2] || idx[le[i]] < rs[1])
+                next
+            Ad <- FUN(x)
+            if (le[i] <= NALL)
+                perm <- matrix(allPerms(le[i]), ncol=le[i]) + cle[i-1]
+            else {
+                ties <- TRUE
+                perm <- matrix(0, nrow=NITER, ncol=le[i])
+                for (j in 1:NITER)
+                    perm[j,] <- permuted.index2(le[i])
+                perm <- perm + if(i==1) 0 else  cle[i-1]
+            }
+            for (j in 1:nrow(perm)) {
+                take[idx] <- perm[j,]
+                val <- FUN(take)
+                if (val < Ad) {
+                    x <- take
+                    Ad <- val
+                    if (trace)
+                        cat(Ad, ":", perm[j,], "\n")
+                }
+            }
+        }
+    }
+    out <- list(statistic=Ad, ties = ties, order = x)
     class(out) <- "nesteddisc"
     out
 }
+

Modified: pkg/inst/ChangeLog
===================================================================
--- pkg/inst/ChangeLog	2008-10-30 13:17:15 UTC (rev 540)
+++ pkg/inst/ChangeLog	2008-11-01 07:55:10 UTC (rev 541)
@@ -2,8 +2,23 @@
 
 VEGAN DEVEL VERSIONS at http://r-forge.r-project.org/
 
-Version 1.16-3 (opened October 27, 2008)
+Version 1.16-4 (opened November 1, 2008)
 
+	* nesteddisc: replaced with a new function that orders tied
+	columns to minimize the discrepancy statistic. If there are <=7
+	tied columns, all upto 5040 permutations are inspected, and if
+	there are >7 tied columns, 5000 random permutations are
+	inspected. The ties are inspected only if they could change order
+	of columns within range of row sums (species richness). A warning
+	with results is only printed if there were >7 tied columns that
+	could influence the statistic, because in other cases the solution
+	is certainly correct. The new function may be slow, and a more
+	clever solution should be found (or nesteddisc dropped from
+	vegan).  With this code, the discrepancy statistic of the 'sipoo'
+	data is 50 (cf. notes on 1.16-3) instead of the original 55.
+
+Version 1.16-3 (closed November 1, 2008)
+
 	* spantree: saves labels of points, and 'cophenetic' and 'plot'
 	use these labels. Function 'plot' uses 'ordilabel' for text
 	labels.



More information about the Vegan-commits mailing list