[Vegan-commits] r1249 - in pkg/vegan: R inst man

Thu Aug 12 18:24:01 CEST 2010

Author: jarioksa
Date: 2010-08-12 18:24:01 +0200 (Thu, 12 Aug 2010)
New Revision: 1249

Modified:
   pkg/vegan/R/vegdist.R
   pkg/vegan/inst/ChangeLog
   pkg/vegan/man/vegdist.Rd
Log:
altGower of Anderson et al. was without range standardization, closes bug #1002
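
To illustrate the change, here is a minimal base-R sketch of the two indices as this log and the ChangeLog entry describe them. It is not vegan's code: gower_sketch and altgower_sketch are hypothetical helper names and the example matrix is made up. The sketch assumes a sites-by-species matrix with no missing values and no constant columns, and it assumes the sum runs over the variables i as in the Gower formula in vegdist.Rd.

```r
# Illustrative sketch only; hypothetical helpers, not vegan's implementation.

# Gower: scale every column to unit range, then average the absolute
# differences over all columns.
gower_sketch <- function(x) {
  x <- as.matrix(x)
  # assumes no constant columns (range > 0) and no NA
  xs <- apply(x, 2, function(v) (v - min(v)) / (max(v) - min(v)))
  n <- nrow(xs)
  d <- matrix(0, n, n)
  for (j in seq_len(n)) {
    for (k in seq_len(n)) {
      d[j, k] <- mean(abs(xs[j, ] - xs[k, ]))
    }
  }
  as.dist(d)
}

# altGower as described in this commit: Manhattan distance on the raw
# (unscaled) data, divided by the number of variables that are not
# double zeros for the pair.
altgower_sketch <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  d <- matrix(0, n, n)
  for (j in seq_len(n)) {
    for (k in seq_len(n)) {
      keep <- x[j, ] != 0 | x[k, ] != 0   # drop double zeros
      if (any(keep)) {
        d[j, k] <- sum(abs(x[j, keep] - x[k, keep])) / sum(keep)
      }
    }
  }
  as.dist(d)
}

# Made-up data: 3 sites (rows) by 3 species (columns)
x <- matrix(c(0, 2, 0,
              4, 0, 0,
              8, 6, 3), nrow = 3, byrow = TRUE)

print(gower_sketch(x))
print(altgower_sketch(x))
```

For rows 1 and 2 of this matrix the third column is a double zero and is dropped, so the altGower sketch gives (4 + 2) / 2 = 3 on the raw scale. The Gower sketch works on columns scaled to the range 0 to 1, which is the standardization that this revision stops applying to altGower.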

Modified: pkg/vegan/R/vegdist.R
===================================================================

--- pkg/vegan/R/vegdist.R	2010-08-12 12:10:36 UTC (rev 1248)
+++ pkg/vegan/R/vegdist.R	2010-08-12 16:24:01 UTC (rev 1249)
@@ -20,7 +20,7 @@
         warning("results may be meaningless because data have negative entries in method ", inm,"\n")
     if (method == 11 && any(colSums(x) == 0)) 
         warning("data have empty species which influence the results im method ", inm, "\n")
-    if (method %in% c(6, 14)) 
+    if (method == 6) # gower, but not altGower
         x <- decostand(x, "range", 2, na.rm = TRUE, ...)
     if (binary) 
         x <- decostand(x, "pa")

Modified: pkg/vegan/inst/ChangeLog
===================================================================
--- pkg/vegan/inst/ChangeLog	2010-08-12 12:10:36 UTC (rev 1248)
+++ pkg/vegan/inst/ChangeLog	2010-08-12 16:24:01 UTC (rev 1249)
@@ -2,7 +2,7 @@
 
 VEGAN DEVEL VERSIONS at http://r-forge.r-project.org/
 
-Version 1.18-9 (opened August 12, 2010)q
+Version 1.18-9 (opened August 12, 2010)
 
 	* mrpp & meandist: I had misinterpreted the classification
 	strength (CS) be based on weight.type=3 or N*(N-1)/2, but it was
@@ -13,6 +13,11 @@
 	monotonically related with this weight.type (they were with
 	weight.type=3), and CS cannot be tested with the current function.
 
+	* vegdist: Anderson et al. (Ecol Lett 9, 683-693; 2006) defined
+	their "alternative Gower" without range standardization of
+	columns.  Reported as bug #1002 in http://r-forge.r-project.org/
+	by Sergio Garcia. Also some small edits of vegdist man page.
+
 Version 1.18-8 (closed August 12, 2010)
 
 	* DESCRIPTION: does not suggest package 'ellipse'.

Modified: pkg/vegan/man/vegdist.Rd
===================================================================
--- pkg/vegan/man/vegdist.Rd	2010-08-12 12:10:36 UTC (rev 1248)
+++ pkg/vegan/man/vegdist.Rd	2010-08-12 16:24:01 UTC (rev 1249)
@@ -55,12 +55,12 @@
     \tab \eqn{d_{jk} = (1/M) \sum_i \frac{|x_{ij}-x_{ik}|}{\max x_i-\min
 	x_i}}{d[jk] = (1/M) sum (abs(x[ij]-x[ik])/(max(x[i])-min(x[i]))}
     \cr
-    \tab where \eqn{M} is the number of columns (excluding missing
+    \tab where \eqn{M} is the number of rows (excluding missing
     values)
     \cr
     \code{altGower}
-    \tab like \code{gower}, but \eqn{M} is the number of pairs with at least
-    one non-zero value (excluding double zeros)
+    \tab like \code{manhattan}, but divided by the number of rows
+    excluding double-zeros (Anderson et al. 2006).
     \cr
     \code{canberra}
     \tab \eqn{d_{jk}=\frac{1}{NZ} \sum_i
@@ -110,28 +110,27 @@
   limit, but can vary among sites with no shared species. For further
   discussion, see Anderson & Millar (2004).
   
-  Mountford index is defined as \eqn{M = 1/\alpha} where \eqn{\alpha} is
-  the parameter of Fisher's logseries assuming that the compared
+  Mountford index is defined as \eqn{M = 1/\alpha} where \eqn{\alpha}
+  is the parameter of Fisher's logseries assuming that the compared
   communities are samples from the same community
   (cf. \code{\link{fisherfit}}, \code{\link{fisher.alpha}}). The index
   \eqn{M} is found as the positive root of equation \eqn{\exp(aM) +
   \exp(bM) = 1 + \exp[(a+b-j)M]}{exp(a*M) + exp(b*M) = 1 +
   exp((a+b-j)*M)}, where \eqn{j} is the number of species occurring in
-  both communities, and \eqn{a} and \eqn{b} are the number of species in
-  each separate community (so the index uses presence--absence
+  both communities, and \eqn{a} and \eqn{b} are the number of species
+  in each separate community (so the index uses presence--absence
   information). Mountford index is usually misrepresented in the
   literature: indeed Mountford (1962) suggested an approximation to be
-  used as starting
-  value in iterations, but the proper index is defined as the root of
-  the equation
-  above. The function \code{vegdist} solves \eqn{M} with the Newton
-  method. Please note that if either \eqn{a} or \eqn{b} are equal to
-  \eqn{j}, one of the communities could be a subset of other, and the
-  dissimilarity is \eqn{0} meaning that non-identical objects may be
-  regarded as similar and the index is non-metric. The Mountford index
-  is in the range \eqn{0 \dots \log(2)}, but the dissimilarities are
-  divided by \eqn{\log(2)} 
-  so that the results will be in the conventional range \eqn{0 \dots 1}.
+  used as starting value in iterations, but the proper index is
+  defined as the root of the equation above. The function
+  \code{vegdist} solves \eqn{M} with the Newton method. Please note
+  that if either \eqn{a} or \eqn{b} is equal to \eqn{j}, one of the
+  communities could be a subset of the other, and the dissimilarity is
+  \eqn{0} meaning that non-identical objects may be regarded as
+  similar and the index is non-metric. The Mountford index is in the
+  range \eqn{0 \dots \log(2)}{0 \dots log(2)}, but the dissimilarities
+  are divided by \eqn{\log(2)}{log(2)} so that the results will be in
+  the conventional range \eqn{0 \dots 1}.
 
   Raup--Crick dissimilarity (\code{method = "raup"}) is a probabilistic
   index based on presence/absence data.  It is defined as \eqn{1 - prob(j)},
@@ -229,24 +228,26 @@
 \author{ Jari Oksanen, with contributions from Tyler Smith (Gower index)
   and Michael Bedward (Raup--Crick index). }
 
-\note{The  function is an alternative to \code{\link{dist}} adding
-  some ecologically meaningful indices.  Both methods should produce
+\note{The function is an alternative to \code{\link{dist}} adding some
+  ecologically meaningful indices.  Both methods should produce
   similar types of objects which can be interchanged in any method
   accepting either.  Manhattan and Euclidean dissimilarities should be
-  identical in both methods. Canberra index is divided by the
-  number of variables in \code{vegdist}, but not in \code{\link{dist}}.
-  So these differ by a constant multiplier, and the alternative in
-  \code{vegdist} is in range (0,1).  Function \code{\link[cluster]{daisy}}
-  (package \pkg{cluster}) provides alternative implementation of Gower
-  index that also can handle mixed data of numeric and class variables. 
-  There are two versions of Gower distance (\code{"gower"}, 
-  \code{"altGower"}) which differ in scaling: \code{"gower"} divides
-  all distances by the number of observations (rows), but 
-  \code{"altGower"} omits double-zeros and divides by the number of 
-  pairs with at least one above-zero value. Gower (1971) suggested this for
-  presences, but it is often taken as the general feature of the Gower 
-  distances. See Examples for implementing the Anderson et al. (2006)
-  variant of the Gower index.
+  identical in both methods. Canberra index is divided by the number
+  of variables in \code{vegdist}, but not in \code{\link{dist}}.  So
+  these differ by a constant multiplier, and the alternative in
+  \code{vegdist} is in range (0,1).  Function
+  \code{\link[cluster]{daisy}} (package \pkg{cluster}) provides an
+  alternative implementation of Gower index that can also handle mixed
+  data of numeric and class variables.  There are two versions of
+  Gower distance (\code{"gower"}, \code{"altGower"}) which differ in
+  scaling: \code{"gower"} divides all distances by the number of
+  observations (rows) and scales each column to unit range, but
+  \code{"altGower"} omits double-zeros and divides by the number of
+  pairs with at least one above-zero value, and does not scale columns
+  (Anderson et al. 2006). Gower (1971) suggested omitting double zeros
+  for presences, but it is often taken as the general feature of the
+  Gower distances. See Examples for implementing the Anderson et
+  al. (2006) variant of the Gower index.
 
   Most dissimilarity indices in \code{vegdist} are designed for
   community data, and they will give misleading values if there are