[Vegan-commits] r1229 - in pkg/vegan: inst man

Tue Jun 15 11:16:36 CEST 2010

Author: gsimpson
Date: 2010-06-15 11:16:36 +0200 (Tue, 15 Jun 2010)
New Revision: 1229

Added:
   pkg/vegan/man/permutations.Rd
Modified:
   pkg/vegan/inst/ChangeLog
   pkg/vegan/man/adonis.Rd
   pkg/vegan/man/anosim.Rd
   pkg/vegan/man/envfit.Rd
   pkg/vegan/man/mantel.Rd
Log:
First attempt to document the general form for permutation tests in Vegan

Modified: pkg/vegan/inst/ChangeLog
===================================================================

--- pkg/vegan/inst/ChangeLog	2010-06-13 06:32:37 UTC (rev 1228)
+++ pkg/vegan/inst/ChangeLog	2010-06-15 09:16:36 UTC (rev 1229)
@@ -6,7 +6,7 @@
 
 	* mantel, mantel.partial: considerable speed up by cleaning
 	innermost loop and replacing as.dist() with direct extraction of
-	lower diagonal. 
+	lower diagonal.
 
 	* treedist: documenting tree dissimilarity function that has been
 	in vegan devel since Aug 17, 2009 (rev928).
@@ -14,9 +14,12 @@
 	* betadisper: 'type = "median"', the default, was not computing
 	the spatial median on the real and imaginary axes separately.
 	Reported by Marek Omelka.
-	
-Version 1.18-5 (closed May 31, 2010) 
 
+	* permutations: First attempt to document within Vegan the general
+	workings of permutation tests. See ?permutations for details.
+
+Version 1.18-5 (closed May 31, 2010)
+
 	* cca, rda: plot() failed if Condition() had factors, but
 	constraints had no factors. An example of failure:
 

Modified: pkg/vegan/man/adonis.Rd
===================================================================
--- pkg/vegan/man/adonis.Rd	2010-06-13 06:32:37 UTC (rev 1228)
+++ pkg/vegan/man/adonis.Rd	2010-06-15 09:16:36 UTC (rev 1229)
@@ -93,7 +93,9 @@
 intact for a particular hypothesis test where one does not want to
 permute the data among particular groups. For instance, \code{strata =
 B} causes permutations among levels of \code{A} but retains data within
-levels of \code{B} (no permutation among levels of \code{B}).
+levels of \code{B} (no permutation among levels of \code{B}). See
+\code{\link{permutations}} for additional details on permutation tests
+in Vegan.
 
 The default \code{\link{contrasts}} are different than in \R in
 general. Specifically, they use \dQuote{sum} contrasts, sometimes known

Modified: pkg/vegan/man/anosim.Rd
===================================================================
--- pkg/vegan/man/anosim.Rd	2010-06-13 06:32:37 UTC (rev 1228)
+++ pkg/vegan/man/anosim.Rd	2010-06-15 09:16:36 UTC (rev 1229)
@@ -53,7 +53,9 @@
 
   The statistical significance of observed \eqn{R} is assessed by
   permuting the grouping vector to obtain the empirical
-  distribution of \eqn{R} under null-model.
+  distribution of \eqn{R} under null-model.  See
+  \code{\link{permutations}} for additional details on permutation tests
+  in Vegan.
 
   The function has \code{summary} and \code{plot} methods.  These both
   show valuable information to assess the validity of the method:  The

Modified: pkg/vegan/man/envfit.Rd
===================================================================
--- pkg/vegan/man/envfit.Rd	2010-06-13 06:32:37 UTC (rev 1228)
+++ pkg/vegan/man/envfit.Rd	2010-06-15 09:16:36 UTC (rev 1229)
@@ -100,7 +100,9 @@
   The goodness of fit statistic is squared correlation coefficient
   (\eqn{r^2}).
   For factors this is defined as \eqn{r^2 = 1 - ss_w/ss_t}, where
-  \eqn{ss_w} and \eqn{ss_t} are within-group and total sums of squares.
+  \eqn{ss_w} and \eqn{ss_t} are within-group and total sums of
+  squares. See \code{\link{permutations}} for additional details on
+  permutation tests in Vegan.
 
   User can supply a vector of prior  weights \code{w}. If the ordination
   object has weights, these will be used. In practise this means that

Modified: pkg/vegan/man/mantel.Rd
===================================================================
--- pkg/vegan/man/mantel.Rd	2010-06-13 06:32:37 UTC (rev 1228)
+++ pkg/vegan/man/mantel.Rd	2010-06-15 09:16:36 UTC (rev 1229)
@@ -34,7 +34,9 @@
   related).  However, the significance cannot be directly assessed,
   because there are \eqn{N(N-1)/2} entries for just \eqn{N} observations.
   Mantel developed asymptotic test, but here we use permutations of
-  \eqn{N} rows and columns of dissimilarity matrix.
+  \eqn{N} rows and columns of dissimilarity matrix.  See
+  \code{\link{permutations}} for additional details on permutation tests
+  in Vegan.
 
   Partial Mantel statistic uses partial correlation
   conditioned on the third matrix. Only the first matrix is permuted so

Added: pkg/vegan/man/permutations.Rd
===================================================================
--- pkg/vegan/man/permutations.Rd	                        (rev 0)
+++ pkg/vegan/man/permutations.Rd	2010-06-15 09:16:36 UTC (rev 1229)
@@ -0,0 +1,107 @@
+\name{permutations}
+\alias{permutations}
+
+\title{Permutation tests in Vegan}
+\description{
+  Unless stated otherwise, vegan currently provides for two types of
+  permutation test:
+  \enumerate{
+    \item{Free permutation of \emph{DATA}, also known as randomisation,
+      and}
+    \item{Free permutation of \emph{DATA} within the levels of a factor
+      variable.}
+  }
+  We use \emph{DATA} to mean either the observed data themselves or some
+  function of the data, for example the residuals of an ordination model
+  when covariables are present.
+  
+  The second type of permutation test above is available if the function
+  providing the test accepts an argument \code{strata} or passes
+  additional arguments (via \code{\dots}) to
+  \code{\link{permuted.index}}.
+
+  The Null hypothesis for these two types of permutation test assumes
+  free exchangeability of \emph{DATA} (within the levels of
+  \code{strata} if specified). Dependence between observations, such as
+  that which arises due to spatial or temporal autocorrelation, or
+  more complicated experimental designs, such as split-plot designs,
+  violates this fundamental assumption of the test and requires restricted
+  permutation test designs. The next major version of Vegan will include
+  infrastructure to handle these more complicated permutation designs.
+
+  Again, unless otherwise stated in the help pages for specific
+  functions, permutation tests in Vegan all follow the same
+  format/structure:
+  \enumerate{
+    \item{An appropriate test statistic is chosen. Which statistic is
+      chosen should be described on the help pages for individual
+      functions.}
+    \item{The value of the test statistic is evaluated for the observed
+      data and analysis/model and recorded. Denote this value
+      \eqn{x_0}{x[0]}.}
+    \item{The \emph{DATA} are randomly permuted according to one of the
+      above two schemes, and the value of the test statistic for this
+      permutation is evaluated and recorded.}
+    \item{Step 3 is repeated a total of \eqn{n} times, where \eqn{n} is
+      the number of permutations requested. Denote these values as
+      \eqn{x_i}{x[i]}, where \eqn{i = 1, \ldots, n}{i = 1, ..., n}.}
+    \item{The values of the test statistic for the \eqn{n} permutations
+      of the \emph{DATA} are combined with the value of the test statistic
+      for the observed data. These \eqn{n + 1} values represent the
+      \emph{Null} or \emph{randomisation} distribution of the test
+      statistic. The observed value for the test statistic is included
+      in the Null distribution, because under the Null hypothesis being
+      tested, the observed value is just a typical value of the test
+      statistic, no different from the values obtained via permutation
+      of \emph{DATA}.}
+    \item{The number of times that a value of the test statistic in the
+      Null distribution is equal to or greater than the value of the
+      test statistic for the observed data is recorded. Denote this
+      count as \eqn{N}.}
+    \item{The permutation p-value is computed as
+      \deqn{p = \frac{N}{n + 1}}{N / (n + 1)}}
+  }
+  The above description illustrates why the default number of
+  permutations specified in Vegan functions takes values of 199 or 999
+  for example. Once the observed value of the test statistic is added to
+  this number of random permutations of \emph{DATA}, pretty p-values are
+  achievable because \eqn{n + 1} becomes 200 or 1000, for example.
+
+  The minimum achievable p-value is
+  \deqn{p = \frac{1}{n + 1}}{1 / (n + 1)}
+  However, one cannot simply increase the number of permutations
+  (\eqn{n}) to achieve a potentially lower p-value unless the number of
+  observations available permits such a number of permutations. This is
+  unlikely to be a problem for any but the smallest data sets when
+  free permutation (randomisation) is valid, but in designs where
+  \code{strata} is specified and there are few observations
+  within each level of \code{strata}, there may not be as many distinct
+  permutations of the data as you might want.
+
+  It is currently the responsibility of the user to determine the total
+  number of possible permutations for their \emph{DATA}. No checks are
+  made within Vegan functions to ensure a sensible number of
+  permutations is chosen.
+
+  Limits on the total number of permutations of \emph{DATA} are more
+  severe in temporally or spatially ordered data or experimental designs
+  with low replication. For example, a time series of 100 observations
+  permuted by cyclic shifts, which preserve the temporal ordering, has
+  just 100 possible permutations \strong{including} the observed ordering.
+
+  In situations where only a low number of permutations is possible due
+  to the nature of \emph{DATA} or the experimental design, enumeration
+  of all permutations becomes important and achievable
+  computationally. Currently, Vegan does not include functions to
+  perform complete enumeration of the set of possible
+  permutations. The next major release of Vegan will include such
+  functionality, however.
+}
+
+\seealso{
+  \code{\link{permutest}}, \code{\link{permuted.index}}
+}
+%\references{
+%}
+\author{ Gavin Simpson }
+\keyword{multivariate}
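
The seven-step procedure described in permutations.Rd above can be sketched in a few lines of plain R. This is only an illustration under stated assumptions, not the code used inside Vegan: the names perm_pvalue(), n_perms() and mean_diff() are hypothetical, the test statistic is a simple absolute difference of group means, and only base R functions are used.

```r
# Illustrative sketch only; NOT vegan's implementation.
# perm_pvalue(), n_perms() and mean_diff() are hypothetical helper names.

perm_pvalue <- function(stat_fun, y, group, n = 999, strata = NULL) {
  n_obs <- length(y)
  # Step 2: evaluate the test statistic for the observed data (x_0)
  x0 <- stat_fun(y, group)
  # Steps 3-4: evaluate the statistic for n random permutations (x_1..x_n)
  x <- numeric(n)
  for (i in seq_len(n)) {
    if (is.null(strata)) {
      idx <- sample.int(n_obs)            # free permutation of all rows
    } else {
      idx <- seq_len(n_obs)
      for (rows in split(seq_len(n_obs), strata)) {
        # permute row indices only within this level of strata
        idx[rows] <- rows[sample.int(length(rows))]
      }
    }
    x[i] <- stat_fun(y[idx], group)
  }
  # Step 5: the null distribution holds the observed value plus n permuted ones
  null_dist <- c(x0, x)
  # Step 6: count values equal to or greater than the observed value
  N <- sum(null_dist >= x0)
  # Step 7: permutation p-value
  N / (n + 1)
}

# Number of possible orderings: N! under free permutation, or the product
# of the within-level factorials when strata is given.
n_perms <- function(n_obs, strata = NULL) {
  if (is.null(strata)) {
    factorial(n_obs)
  } else {
    prod(factorial(as.vector(table(strata))))
  }
}

# Example statistic (an assumption for this sketch): absolute difference
# between the means of two groups.
mean_diff <- function(y, g) abs(diff(tapply(y, g, mean)))

set.seed(1)
g <- factor(rep(c("a", "b"), each = 10))
y <- c(rnorm(10), rnorm(10, mean = 3))
p_free   <- perm_pvalue(mean_diff, y, g, n = 199)
blocks   <- rep(1:5, times = 4)           # hypothetical blocking factor
p_strata <- perm_pvalue(mean_diff, y, g, n = 199, strata = blocks)
```

With n = 199 the smallest p-value this sketch can return is 1/200, because the observed value is always counted in N. The n_perms() helper shows how quickly the set of possible permutations shrinks under strata: six observations in three levels of two give only 2 * 2 * 2 = 8 orderings, compared with 6! = 720 under free permutation.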