[Vegan-commits] r1256 - in pkg/vegan: R inst man

Tue Aug 17 09:21:22 CEST 2010

Author: jarioksa
Date: 2010-08-17 09:21:22 +0200 (Tue, 17 Aug 2010)
New Revision: 1256

Modified:
   pkg/vegan/R/mrpp.R
   pkg/vegan/inst/ChangeLog
   pkg/vegan/man/mrpp.Rd
Log:
remove Classification Strength (CS) from mrpp

Modified: pkg/vegan/R/mrpp.R
===================================================================

--- pkg/vegan/R/mrpp.R	2010-08-13 15:37:58 UTC (rev 1255)
+++ pkg/vegan/R/mrpp.R	2010-08-17 07:21:22 UTC (rev 1256)
@@ -31,11 +31,10 @@
     del <- weighted.mean(classdel, w = w, na.rm = TRUE)
     E.del <- mean(dmat, na.rm = TRUE)
     ## 'Classification strength' if weight.type == 1
-    if (weight.type == 1) {
-        CS <- mean(dmat[outer(grouping, grouping, "!=")]) - del
-    } else {
-        CS <- NA
-    }
+    ## Do not calculate classification strength because there is no
+    ## significance test for it. Keep the item in reserve for
+    ## possible later re-inclusion.
+    CS <- NA
     if (missing(strata)) 
         strata <- NULL
     perms <- sapply(1:permutations, function(x) grouping[permuted.index(N, 

Modified: pkg/vegan/inst/ChangeLog
===================================================================
--- pkg/vegan/inst/ChangeLog	2010-08-13 15:37:58 UTC (rev 1255)
+++ pkg/vegan/inst/ChangeLog	2010-08-17 07:21:22 UTC (rev 1256)
@@ -4,14 +4,14 @@
 
 Version 1.18-9 (opened August 12, 2010)
 
-	* mrpp & meandist: I had misinterpreted the classification
-	strength (CS) be based on weight.type=3 or N*(N-1)/2, but it was
-	based on weight.type=1 or N. The literature reference to CS is now
-	more appropriate. Thanks to Dr John Van Sickle (US EPA, Corvallis,
-	Oregon) for pointing out that I had misread his papers. Needs
-	still checking. For instance, MRPP statistic and CS are not
-	monotonically related with this weight.type (they were with
-	weight.type=3), and CS cannot be tested with the current function.
+	* mrpp & meandist: John Van Sickle notified us that his
+	Classification Strength (CS) uses 'weight.type = 1' (or n)
+	instead of 'weight.type = 3' (or n(n-1)/2).  Calculation of CS
+	was dropped from mrpp(), because with this weighting it no longer
+	has an exact relation to the corresponding MRPP statistic and the
+	function mrpp() cannot provide a significance test for both
+	statistics together. CS is kept in meandist(), where its
+	calculation now uses the correct weight type.
 
 	* vegdist: Anderson et al. (Ecol Lett 9, 683-693; 2006) defined
 	their "alternative Gower" without range standardization of

Modified: pkg/vegan/man/mrpp.Rd
===================================================================
--- pkg/vegan/man/mrpp.Rd	2010-08-13 15:37:58 UTC (rev 1255)
+++ pkg/vegan/man/mrpp.Rd	2010-08-17 07:21:22 UTC (rev 1256)
@@ -6,12 +6,12 @@
 \alias{print.summary.meandist}
 \alias{plot.meandist}
 
-\title{ Multi Response Permutation Procedure of Within- versus
-  Among-Group Dissimilarities}
+\title{ Multi Response Permutation Procedure of Mean Dissimilarity Matrix}
 
 \description{ Multiple Response Permutation Procedure (MRPP) provides a
 test of whether there is a significant difference between two or more
-groups of sampling units.  }
+groups of sampling units. Function \code{meandist} finds the mean
+within-block and between-block dissimilarities.}
 
 \usage{
 mrpp(dat, grouping, permutations = 999, distance = "euclidean",
@@ -65,9 +65,9 @@
 the average of the dissimilarities between two random collection of
 sampling units drawn from the entire population. 
 
-The mrpp statistic \eqn{\delta} is simply the overall weighted mean of
+The mrpp statistic \eqn{\delta} is the overall weighted mean of
 within-group means of the pairwise dissimilarities among sampling
-units. The correct choice of group weights is currently not clear. The
+units. The choice of group weights is currently not clear. The
 \code{mrpp} function offers three choices: (1) group size (\eqn{n}), (2) a
 degrees-of-freedom analogue (\eqn{n-1}), and (3) a weight that is the number
 of unique distances calculated among \eqn{n} sampling units (\eqn{n(n-1)/2}).
@@ -75,22 +75,14 @@
 The \code{mrpp} algorithm first calculates all pairwise distances in the
 entire dataset, then calculates \eqn{\delta}. It then permutes the
 sampling units and their associated pairwise distances, and recalculates
-a \eqn{\delta} based on the permuted data. It repeats the permutation
-step \code{permutations} times. The significance test is simply the
+\eqn{\delta} based on the permuted data. It repeats the permutation
+step \code{permutations} times. The significance test is the
 fraction of permuted deltas that are less than the observed delta, with
 a small sample correction. The function also calculates the
 change-corrected within-group agreement
 \eqn{A = 1 -\delta/E(\delta)}, where \eqn{E(\delta)} is the expected
-\eqn{\delta} assessed as the average of permutations.
+\eqn{\delta} assessed as the average of dissimilarities.
 
-With \code{weight.type = 1}, the function also calculates classification
-strength (Van Sickle and Hughes 2000) which is defined as the difference
-between average between group dissimilarities and within group
-dissimilarities (with weights \eqn{n}).  With \code{weight.type = 1} the
-classification strength is closely related to the MRPP statistic
-\eqn{A}, but not exactly, and the significance values from permutation
-tests only concern \eqn{A}.
-
 If the first argument \code{dat} can be interpreted as dissimilarities,
 they will be used directly. In other cases the function treats
 \code{dat} as observations, and uses \code{\link{vegdist}} to find the
@@ -103,19 +95,21 @@
 (off-diagonal elements), and an attribute \code{n} of \code{grouping}
 counts. Function \code{summary} finds the within-class, between-class
 and overall means of these dissimilarities, and the MRPP statistics with
-all \code{weight.type} options and the classification strength. The
-function does not allow significance tests for these statistics, but you
-must use \code{mrpp} with appropriate \code{weight.type}.  Function
-\code{plot} draws a dendrogram or a histogram of the result matrix based
-on the within-group and between group dissimilarities. The dendrogram is
-given with the \code{cluster} argument which is passed to
-\code{\link{hclust}}. The terminal segments hang to within-cluster
-dissimilarity. If some of the clusters is more heterogeneous than the
-combined class, the leaf segment is reversed.  The histogram is similar
-as the graphics used by Van Sickle and Hughes(1997), except that they
-are based on dissimilarities instead of similarities: horizontal line is
-drawn at the level of mean between-cluster dissimilarity and vertical
-lines connect within-cluster dissimilarities to this line.
+all \code{weight.type} options and the Classification Strength, CS (Van
+Sickle and Hughes, 2000).  CS is closely related to the MRPP statistic with
+\code{weight.type = 1}. The function does not perform significance tests
+for these statistics; to test an MRPP statistic, use \code{mrpp} with the
+appropriate \code{weight.type}. There is currently no significance test for CS.
+Function \code{plot} draws a dendrogram or a histogram of the result
+matrix based on the within-group and between-group dissimilarities. The
+dendrogram is found with the method given in the \code{cluster} argument
+using function \code{\link{hclust}}. The terminal segments hang to
+within-cluster dissimilarity. If some of the clusters are more
+heterogeneous than the combined class, the leaf segments are reversed.
+The histograms are based on dissimilarities, but are otherwise similar to
+those of Van Sickle and Hughes (2000): a horizontal line is drawn at the
+level of mean between-cluster dissimilarity and vertical lines connect
+within-cluster dissimilarities to this line.
 }
 
 \value{
@@ -124,8 +118,8 @@
   \item{delta }{The overall weighted mean of group mean distances.}
   \item{E.delta}{expected delta, under the null hypothesis of no group
     structure. This is the mean of original dissimilarities.}
-  \item{CS}{Classification strength (Van Sickle 1997) with
-    \code{weight.type = 3} and \code{NA} with other weights.}
+  \item{CS}{Classification strength (Van Sickle and Hughes,
+    2000). Currently not implemented and always \code{NA}.}
   \item{n}{Number of observations in each class.}
   \item{classdelta}{Mean dissimilarities within classes. The overall
     \eqn{\delta} is the weighted average of these values with given
@@ -167,11 +161,6 @@
 \code{mrpp} models can be analysed with \code{\link{adonis}} which seems
 not suffer from the same problems as \code{mrpp} and is a more robust
 alternative.
-
-Development version 1.18-8 and release version 1.17-3 and prior based
-classification strength (Van Sickle and Hughes, 2000) on
-\code{weight.type=3}, but the original paper clearly used
-\code{weight.type=1} (corrected on August 12, 2010).
 }
 \seealso{
   \code{\link{anosim}} for a similar test based on ranks, and
@@ -202,10 +191,11 @@
 )
 par(def.par)
 ## meandist
-dune.md <- meandist(vegdist(dune), dune.env$Management)
+dune.md <- with(dune.env, meandist(vegdist(dune), Management))
 dune.md
 summary(dune.md)
 plot(dune.md)
+plot(dune.md, kind="histogram")
 }
 \keyword{ multivariate }
 \keyword{ nonparametric }