[Vegan-commits] r2121 - pkg/vegan/man
noreply at r-forge.r-project.org
Wed Mar 21 10:36:05 CET 2012
Author: jarioksa
Date: 2012-03-21 10:36:05 +0100 (Wed, 21 Mar 2012)
New Revision: 2121
Modified:
pkg/vegan/man/anosim.Rd
pkg/vegan/man/mrpp.Rd
pkg/vegan/man/simper.Rd
Log:
Tell about location/dispersion mix-up in anosim, mrpp & simper
References to a new paper by Warton et al (MEE 3, 89-101; 2012)
to support previous warnings on the same issue in help pages.
Modified: pkg/vegan/man/anosim.Rd
===================================================================
--- pkg/vegan/man/anosim.Rd 2012-03-09 13:27:09 UTC (rev 2120)
+++ pkg/vegan/man/anosim.Rd 2012-03-21 09:36:05 UTC (rev 2121)
@@ -84,15 +84,24 @@
}
\references{
Clarke, K. R. (1993). Non-parametric multivariate analysis of changes
- in community structure. \emph{Australian Journal of Ecology} 18, 117-143.
+ in community structure. \emph{Australian Journal of Ecology} 18,
+ 117--143.
+
+ Warton, D.I., Wright, T.W., Wang, Y. 2012. Distance-based multivariate
+ analyses confound location and dispersion effects. \emph{Methods in
+ Ecology and Evolution}, 3, 89--101.
+
}
\author{Jari Oksanen, with a help from Peter R. Minchin.}
\note{
- I don't quite trust this method. Somebody should study its
- performance carefully. The function returns a lot of information
- to ease further scrutiny. Most \code{anosim} models could be analysed
- with \code{\link{adonis}} which seems to be a more robust alternative.
+ The \code{anosim} function can confound the differences between groups
+ and dispersion within groups and the results can be difficult to
+ interpret (cf. Warton et al. 2012). The function returns a lot of
+ information to ease studying its performance. Most \code{anosim}
+ models could be analysed with \code{\link{adonis}} which seems to be a
+ more robust alternative.
+
}
\seealso{\code{\link{mrpp}} for a similar function using original
Modified: pkg/vegan/man/mrpp.Rd
===================================================================
--- pkg/vegan/man/mrpp.Rd 2012-03-09 13:27:09 UTC (rev 2120)
+++ pkg/vegan/man/mrpp.Rd 2012-03-21 09:36:05 UTC (rev 2121)
@@ -54,72 +54,75 @@
\item{\dots}{Further arguments passed to functions.}
}
-\details{ Multiple Response Permutation Procedure (MRPP) provides a test
-of whether there is a significant difference between two or more groups
-of sampling units. This difference may be one of location (differences
-in mean) or one of spread (differences in within-group
-distance). Function \code{mrpp} operates on a \code{data.frame} matrix
-where rows are observations and responses data matrix. The response(s)
-may be uni- or multivariate. The method is philosophically and
-mathematically allied with analysis of variance, in that it compares
-dissimilarities within and among groups. If two groups of sampling units
-are really different (e.g. in their species composition), then average
-of the within-group compositional dissimilarities ought to be less than
-the average of the dissimilarities between two random collection of
-sampling units drawn from the entire population.
+\details{
-The mrpp statistic \eqn{\delta} is the overall weighted mean of
-within-group means of the pairwise dissimilarities among sampling
-units. The choice of group weights is currently not clear. The
-\code{mrpp} function offers three choices: (1) group size (\eqn{n}), (2) a
-degrees-of-freedom analogue (\eqn{n-1}), and (3) a weight that is the number
-of unique distances calculated among \eqn{n} sampling units (\eqn{n(n-1)/2}).
+ Multiple Response Permutation Procedure (MRPP) provides a test of
+ whether there is a significant difference between two or more groups
+ of sampling units. This difference may be one of location (differences
+ in mean) or one of spread (differences in within-group distance;
+ cf. Warton et al. 2012). Function \code{mrpp} operates on a
+ \code{data.frame} or matrix where rows are observations and columns
+ are responses. The response(s) may be uni- or multivariate. The method
+ is philosophically and mathematically allied with analysis of
+ variance, in that it compares dissimilarities within and among
+ groups. If two groups of sampling units are really different (e.g. in
+ their species composition), then the average of the within-group
+ compositional dissimilarities ought to be less than the average of the
+ dissimilarities between two random collections of sampling units drawn
+ from the entire population.
-The \code{mrpp} algorithm first calculates all pairwise distances in the
-entire dataset, then calculates \eqn{\delta}. It then permutes the
-sampling units and their associated pairwise distances, and recalculates
-\eqn{\delta} based on the permuted data. It repeats the permutation
-step \code{permutations} times. The significance test is the
-fraction of permuted deltas that are less than the observed delta, with
-a small sample correction. The function also calculates the
-change-corrected within-group agreement
-\eqn{A = 1 -\delta/E(\delta)}, where \eqn{E(\delta)} is the expected
-\eqn{\delta} assessed as the average of dissimilarities.
+ The mrpp statistic \eqn{\delta} is the overall weighted mean of
+ within-group means of the pairwise dissimilarities among sampling
+ units. The choice of group weights is currently not clear. The
+ \code{mrpp} function offers three choices: (1) group size (\eqn{n}),
+ (2) a degrees-of-freedom analogue (\eqn{n-1}), and (3) a weight that
+ is the number of unique distances calculated among \eqn{n} sampling
+ units (\eqn{n(n-1)/2}).
-If the first argument \code{dat} can be interpreted as dissimilarities,
-they will be used directly. In other cases the function treats
-\code{dat} as observations, and uses \code{\link{vegdist}} to find the
-dissimilarities. The default \code{distance} is Euclidean as in the
-traditional use of the method, but other dissimilarities in
-\code{\link{vegdist}} also are available.
+ The \code{mrpp} algorithm first calculates all pairwise distances in
+ the entire dataset, then calculates \eqn{\delta}. It then permutes the
+ sampling units and their associated pairwise distances, and
+ recalculates \eqn{\delta} based on the permuted data. It repeats the
+ permutation step \code{permutations} times. The significance test is
+ the fraction of permuted deltas that are less than the observed delta,
+ with a small sample correction. The function also calculates the
+ chance-corrected within-group agreement \eqn{A = 1 -\delta/E(\delta)},
+ where \eqn{E(\delta)} is the expected \eqn{\delta} assessed as the
+ average of dissimilarities.
-Function \code{meandist} calculates a matrix of mean within-cluster
-dissimilarities (diagonal) and between-cluster dissimilarities
-(off-diagonal elements), and an attribute \code{n} of \code{grouping}
-counts. Function \code{summary} finds the within-class, between-class
-and overall means of these dissimilarities, and the MRPP statistics with
-all \code{weight.type} options and the Classification Strength, CS (Van
-Sickle and Hughes, 2000). CS is defined for dissimiliraties as
-\eqn{\bar{B} - \bar{W}}{Bbar-Wbar}, where \eqn{\bar{B}}{Bbar} is the
-mean between cluster dissimilarity and \eqn{\bar{W}}{Wbar} is the mean
-within cluster dissimilarity with \code{weight.type = 1}. The function
-does not perform significance tests for these statistics, but you must
-use \code{mrpp} with appropriate \code{weight.type}. There is currently
-no significance test for CS, but \code{mrpp} with \code{weight.type = 1}
-gives the correct test for \eqn{\bar{W}}{Wbar} and a good approximation
-for CS. Function \code{plot} draws a dendrogram or a histogram of the
-result matrix based on the within-group and between group
-dissimilarities. The dendrogram is found with the method given in the
-\code{cluster} argument using function \code{\link{hclust}}. The
-terminal segments hang to within-cluster dissimilarity. If some of the
-clusters are more heterogeneous than the combined class, the leaf
-segment are reversed. The histograms are based on dissimilarites, but
-ore otherwise similar to those of Van Sickle and Hughes (2000):
-horizontal line is drawn at the level of mean between-cluster
-dissimilarity and vertical lines connect within-cluster dissimilarities
-to this line.
-}
+ If the first argument \code{dat} can be interpreted as
+ dissimilarities, they will be used directly. In other cases the
+ function treats \code{dat} as observations, and uses
+ \code{\link{vegdist}} to find the dissimilarities. The default
+ \code{distance} is Euclidean as in the traditional use of the method,
+ but other dissimilarities in \code{\link{vegdist}} also are available.
+ Function \code{meandist} calculates a matrix of mean within-cluster
+ dissimilarities (diagonal) and between-cluster dissimilarities
+ (off-diagonal elements), and an attribute \code{n} of \code{grouping}
+ counts. Function \code{summary} finds the within-class, between-class
+ and overall means of these dissimilarities, and the MRPP statistics
+ with all \code{weight.type} options and the Classification Strength,
+ CS (Van Sickle and Hughes, 2000). CS is defined for dissimilarities as
+ \eqn{\bar{B} - \bar{W}}{Bbar-Wbar}, where \eqn{\bar{B}}{Bbar} is the
+ mean between cluster dissimilarity and \eqn{\bar{W}}{Wbar} is the mean
+ within cluster dissimilarity with \code{weight.type = 1}. The function
+ does not perform significance tests for these statistics, but you must
+ use \code{mrpp} with appropriate \code{weight.type}. There is
+ currently no significance test for CS, but \code{mrpp} with
+ \code{weight.type = 1} gives the correct test for \eqn{\bar{W}}{Wbar}
+ and a good approximation for CS. Function \code{plot} draws a
+ dendrogram or a histogram of the result matrix based on the
+ within-group and between-group dissimilarities. The dendrogram is
+ found with the method given in the \code{cluster} argument using
+ function \code{\link{hclust}}. The terminal segments hang to
+ within-cluster dissimilarity. If some of the clusters are more
+ heterogeneous than the combined class, the leaf segments are reversed.
+ The histograms are based on dissimilarities, but are otherwise similar
+ to those of Van Sickle and Hughes (2000): a horizontal line is drawn at
+ the level of mean between-cluster dissimilarity and vertical lines
+ connect within-cluster dissimilarities to this line. }
+
\value{
The function returns a list of class mrpp with following items:
\item{call }{ Function call.}
@@ -147,7 +150,6 @@
B. McCune and J. B. Grace. 2002. \emph{Analysis of Ecological
Communities.} MjM Software Design, Gleneden Beach, Oregon, USA.
-
P. W. Mielke and K. J. Berry. 2001. \emph{Permutation Methods: A
Distance Function Approach.} Springer Series in
Statistics. Springer.
@@ -156,6 +158,9 @@
ecoregions, catchments, and geographic clusters of aquatic vertebrates
in Oregon. \emph{J. N. Am. Benthol. Soc.} 19:370--384.
+ Warton, D.I., Wright, T.W., Wang, Y. 2012. Distance-based multivariate
+ analyses confound location and dispersion effects. \emph{Methods in
+ Ecology and Evolution}, 3, 89--101.
}
\author{
Modified: pkg/vegan/man/simper.Rd
===================================================================
--- pkg/vegan/man/simper.Rd 2012-03-09 13:27:09 UTC (rev 2120)
+++ pkg/vegan/man/simper.Rd 2012-03-21 09:36:05 UTC (rev 2121)
@@ -60,6 +60,15 @@
the data frames also include the cumulative contributions and
are ordered by species contribution.
+ The results of \code{simper} can be very difficult to interpret. The
+ method very badly confounds the mean between-group differences and
+ within-group variation, and seems to single out variable species
+ instead of distinctive species (Warton et al. 2012). Even if you make
+ groups that are copies of each other, the method will single out
+ species with high contribution, but these are not contributions
+ to non-existing between-group differences but to within-group
+ variation in species abundance.
+
}
\value{
@@ -92,6 +101,10 @@
Clarke, K.R. 1993. Non-parametric multivariate analyses of changes
in community structure. \emph{Australian Journal of Ecology}, 18,
117–143.
+
+ Warton, D.I., Wright, T.W., Wang, Y. 2012. Distance-based multivariate
+ analyses confound location and dispersion effects. \emph{Methods in
+ Ecology and Evolution}, 3, 89--101.
}
\keyword{multivariate}
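
Illustration (not part of the commit): the MRPP procedure described in the
mrpp.Rd details above can be sketched in a few lines. This is a minimal
sketch in Python, not vegan's implementation. The names mrpp_delta and mrpp
are hypothetical, the comparison "permuted delta <= observed delta" with a
small tolerance and the (hits + 1)/(permutations + 1) correction are
assumptions, and E(delta) is taken as the plain average of all pairwise
dissimilarities, as the help text states.

```python
import random
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def mrpp_delta(dist, groups, weight_type=1):
    """Weighted mean of the within-group mean pairwise dissimilarities.

    weight_type: 1 = n, 2 = n - 1, 3 = n(n - 1)/2, with n the group size.
    """
    members = {}
    for i, g in enumerate(groups):
        members.setdefault(g, []).append(i)
    num = den = 0.0
    for idx in members.values():
        n = len(idx)
        pairs = list(combinations(idx, 2))
        if not pairs:  # a singleton group has no within-group distance
            continue
        mean_d = sum(dist[i][j] for i, j in pairs) / len(pairs)
        w = {1: n, 2: n - 1, 3: n * (n - 1) / 2}[weight_type]
        num += w * mean_d
        den += w
    return num / den


def mrpp(points, groups, permutations=999, weight_type=1, seed=0):
    """Return (delta, p_value, A) for a label-permutation test."""
    n = len(points)
    dist = [[euclidean(points[i], points[j]) for j in range(n)]
            for i in range(n)]
    observed = mrpp_delta(dist, groups, weight_type)
    rng = random.Random(seed)
    labels = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(labels)  # permute group membership, keep distances
        # Assumed tolerance so that ties are not lost to rounding.
        if mrpp_delta(dist, labels, weight_type) <= observed + 1e-12:
            hits += 1
    p_value = (hits + 1) / (permutations + 1)  # small-sample correction
    # E(delta) assessed as the average of all pairwise dissimilarities.
    expected = sum(dist[i][j] for i, j in combinations(range(n), 2))
    expected /= n * (n - 1) / 2
    return observed, p_value, 1 - observed / expected


if __name__ == "__main__":
    # Same centroid, different spread: a dispersion effect only.
    pts = [(0, 0)] * 4 + [(10, 0), (-10, 0), (0, 10), (0, -10)]
    grp = ["tight"] * 4 + ["spread"] * 4
    print(mrpp(pts, grp))
```

The demo data have identical group centroids and differ only in spread, yet
the test is expected to return a small p-value. That is the
location/dispersion mix-up the log message refers to.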