[Vegan-commits] r2906 - in pkg/vegan: . R inst man tests/Examples vignettes

noreply at r-forge.r-project.org noreply at r-forge.r-project.org
Thu Nov 6 10:17:30 CET 2014


Author: jarioksa
Date: 2014-11-06 10:17:29 +0100 (Thu, 06 Nov 2014)
New Revision: 2906

Added:
   pkg/vegan/R/ordiareatest.R
Modified:
   pkg/vegan/NAMESPACE
   pkg/vegan/R/estimateR.default.R
   pkg/vegan/R/permustats.R
   pkg/vegan/R/poolaccum.R
   pkg/vegan/R/specpool.R
   pkg/vegan/R/text.cca.R
   pkg/vegan/inst/ChangeLog
   pkg/vegan/inst/NEWS.Rd
   pkg/vegan/man/dune.taxon.Rd
   pkg/vegan/man/ordihull.Rd
   pkg/vegan/man/permustats.Rd
   pkg/vegan/man/specpool.Rd
   pkg/vegan/tests/Examples/vegan-Ex.Rout.save
   pkg/vegan/vignettes/diversity-vegan.Rnw
   pkg/vegan/vignettes/vegan.bib
Log:
Squashed commit of the following:

commit 73619424238a9fe15a6297bdd715a2204f6c9075
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Thu Nov 6 11:08:31 2014 +0200

    Update examples (see comments below for: commsim swhs_samp)

    * CHECK THIS: commsim models seem to have changed with
      allowing double in swsh_samp methods
    * Changes in avoiding dollar for with, and with for data
    * Chao changes, in particular the small sample correction

commit 1f038ef610efc1fc7a317a9aaf87474ed36cbdf9
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Thu Nov 6 10:59:42 2014 +0200

    Bad \usage lines found in documentation object 'specpool'

commit c20e2f330776f260bce1c218ba451bfc1851a26b
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Thu Nov 6 10:51:58 2014 +0200

    Proof read NEWS (and dune.taxon man page accordingly)

commit 9e8329093b8077d7164086bb1f8b899f6a54688f
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Thu Nov 6 10:25:37 2014 +0200

    NEWS about changes in Chao extrapolated richness

commit e92c6af0f46e197c170166104bd69746ea79c48f
Merge: f238aa9 b77c88d
Author: Gavin Simpson <ucfagls at gmail.com>
Date:   Wed Nov 5 07:37:01 2014 -0700

    Merge pull request #64 from jarioksa/ordiareatests

    Add ordiareatests for ordihull and ordiellipse

commit f238aa9c5a20936f0b4df0cfa721643fdc47417c
Merge: 6c108fa 01af54f
Author: Gavin Simpson <ucfagls at gmail.com>
Date:   Wed Nov 5 07:26:32 2014 -0700

    Merge pull request #67 from jarioksa/chao

    Chao equations

commit 01af54fbc842dec7f90879296276485d742d7ab2
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Wed Nov 5 10:11:31 2014 +0200

    sync poolaccum with Chao changes in specpool

    * switch to bias-corrected Chao when a2==0
    * use small-sample correction in Chao (hard-wired)

commit 69c971df80cb088e933d8927e304ac1979aa60e6
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Wed Nov 5 09:56:58 2014 +0200

    estimateR will not use small sample correction

    In addition, clarify man/ page for estimateR

commit 2e2df0e15eb45170519554a16aedd6902fa8d292
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Tue Nov 4 13:41:02 2014 +0200

    Make small sample correction optional in specpool

commit 3af5f26640f18f000911fb908f348f39a77a8b4e
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Tue Nov 4 11:41:59 2014 +0200

    Update diversity vignette for the current state of Chao estimates

commit b5b8b8e19bc2fc0869690cd6120a8e7b39e38ec6
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Tue Nov 4 11:39:02 2014 +0200

    fix small sample correction of var(Chao) when there are no doubletons

    I reconstructed the variance formula (3b) of Chiu et al. 2014 using
    the procedure described in their online appendix. I could get the same
    formula except for the small-sample correction: it seems that EstimateS
    web page got this correctly, but Chiu et al. ignore the small-sample
    correction for all but the first term.

commit 562a531935df234fd204f917a813602ac5e9d3a2
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Mon Nov 3 11:08:03 2014 +0200

    Use EstimateS formula (eq6) for the variance

    The EstimateS web page eq6 formula is actually incorrect: it is
    derived on the partial derivatives of classic Chao, but the terms
    of the final result are replaced with the terms of bias-corrected
    Chao. However, numerically the error is only about 1% of the correct
    form and therefore we resent to use this widespread approximation.

commit 6c108fa55254aa06c0286f8b4e4bdecf4260eae0
Merge: 6819b9d feaf74e
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Sun Nov 2 16:41:57 2014 +0200

    Merge pull request #68 from gavinsimpson/text-cca-labels-and-centroids

    Text cca labels and centroids

commit feaf74e1ac5001831946a2fe8649d2e8d25963f7
Author: Gavin Simpson <ucfagls at gmail.com>
Date:   Sat Nov 1 20:09:48 2014 -0600

    document change to text.cca with centroids and user-supplied labels

commit c439fae28d932500e1992cad2a4b5689c61c2382
Author: Gavin Simpson <ucfagls at gmail.com>
Date:   Sat Nov 1 20:08:10 2014 -0600

    fix testing if centroids in biplot scores when user-supplied labels are given

commit 1768ac0d0eea8171b05791a3ea5e958cab1f5b65
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Fri Oct 31 10:00:27 2014 +0200

    Forgot to take the sd of variance in estimateR

commit cdbaa810c18a6b7924f65aeacf61155a41223f53
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Thu Oct 30 19:28:46 2014 +0200

    Proof-read equations in updated Chao documentation

commit b128c28d8c5c2913d8714236d08c0cffea4736f7
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Thu Oct 30 19:23:29 2014 +0200

    proof-read and format man/specpool.Rd

commit 7ae52d63725e684723a6259193d760d16ccb170b
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Thu Oct 30 18:42:55 2014 +0200

    update man/ pages for upgraded Chao richness

commit 38f01e08de3a3413cc6e96a170b6dbaaee17b223
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Thu Oct 30 18:17:09 2014 +0200

    Use always bias-corrected form in Chao1 of estimateR

commit 0502202e501f878adc38cc5c1a869980c11b3374
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Thu Oct 30 16:49:51 2014 +0200

    Update vignettes to the update Chao richness estimators

commit 4e056265951a001aac8b1b41f3f53c1c80e3bc4a
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Thu Oct 30 16:08:31 2014 +0200

    Finally complete eq. for variance of extrapolated bootstrap richness

commit a34f01f7fdc9a2903fba9517cac2bac6c1e7ef88
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Wed Oct 29 21:00:30 2014 +0200

    Chao index in specpool got small-sample correction and bias-correction

commit a0a2b1fd664214aded5b902eee83e420f47742ae
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Wed Oct 29 20:20:37 2014 +0200

    Chao1 got small-sample correction and bias-corrected variance

    estimateR upgrade following Chun et al., Biometrics 70, 671-682
    (2014). Now the behaviour should be similar as in EstimateS, but
    the variance formulae differ (our based on the cited publication).

commit 016b37945d8ddcd57756cb7ef51c940702e4a3c6
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Wed Oct 29 19:55:56 2014 +0200

    esttimateR: use bias-corrected Chao1 only when doubletons = 0

    Previously we always used bias-corrected form of Chao1, but
    it seems that the normal biased form should normally be used, and
    only when it cannot be estimated (because we divide by the number
    of doubletons).

    The sd of bias-corrected form is made NA in this commit: fixed in
    the next one.

commit b77c88d91e57e08b47a9646947095c2d0d8fa3ab
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Mon Oct 27 11:30:06 2014 +0200

    NEWS about ordiareatests

commit 8117a128d867501b5e1c6dfd5d7811c39e3cff4d
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Sun Oct 26 20:00:08 2014 +0200

    permutations need be transposed in permustats of ordiareatest

commit ffa3878e89fcf67a02e1bc487dde99ab098a3dab
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Sun Oct 26 19:41:53 2014 +0200

    Add permustats.ordiareatest

commit 04eb066c03e5c2262c299944f5dcc82cc5cbf75f
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Sun Oct 26 19:41:07 2014 +0200

    Document and export ordiareatest: passes R CMD check

commit d96189be2f931bd2b017dc404e20e875ffe0c09b
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Sun Oct 26 18:52:09 2014 +0200

    Add print.ordiareatest

commit 86a385a44011dead168f902f343b172aeb83354e
Author: Jari Oksanen <jari.oksanen at oulu.fi>
Date:   Sun Oct 26 18:28:55 2014 +0200

    Add ordiareatest for the areas of convex hulls or ellipses in ordination

    The function is bases on ordihull and ordiellipse and performs permutation
    tests on the sizes of fitted convex hulls or ellipses. The tests are
    one-sided and the alternative hypothesis is that the areas covered by
    convex hulls or ellipses are smaller than random expectations.

Modified: pkg/vegan/NAMESPACE
===================================================================
--- pkg/vegan/NAMESPACE	2014-10-29 08:36:32 UTC (rev 2905)
+++ pkg/vegan/NAMESPACE	2014-11-06 09:17:29 UTC (rev 2906)
@@ -15,6 +15,7 @@
 metaMDSdist, metaMDSiter, metaMDSredist, MDSrotate, metaMDS, monoMDS,
 mrpp, msoplot, mso, multipart, make.commsim, nestedbetajac, nestedbetasor, nestedchecker,
 nesteddisc, nestedn0, nestednodf, nestedtemp, nullmodel, oecosimu,
+ordiareatest,
 ordiR2step, ordiarrows, ordicloud, ordicluster, ordiellipse, ordigrid,
 ordihull, ordilabel, ordiplot, ordipointlabel, ordiresids,
 ordisegments, ordispider, ordisplom, ordistep, ordisurf,
@@ -246,6 +247,7 @@
 S3method(permustats, mantel)
 S3method(permustats, mrpp)
 S3method(permustats, oecosimu)
+S3method(permustats, ordiareatest)
 S3method(permustats, permutest.betadisper)
 S3method(permustats, permutest.cca)
 S3method(permustats, protest)
@@ -362,6 +364,7 @@
 S3method(print, nestedtemp)
 S3method(print, nullmodel)
 S3method(print, oecosimu)
+S3method(print, ordiareatest)
 S3method(print, permat)
 S3method(print, permutest.betadisper)
 S3method(print, permutest.cca)

Modified: pkg/vegan/R/estimateR.default.R
===================================================================
--- pkg/vegan/R/estimateR.default.R	2014-10-29 08:36:32 UTC (rev 2905)
+++ pkg/vegan/R/estimateR.default.R	2014-11-06 09:17:29 UTC (rev 2906)
@@ -28,8 +28,9 @@
     }
     if (!identical(all.equal(x, round(x)), TRUE)) 
         stop("function accepts only integers (counts)")
-    freq <- x[x > 0]
     X <- x[x > 0]
+    N <- sum(X)
+    SSC <- 1 # (N-1)/N # do NOT use small-sample correction
     T.X <- table(X)
     S.obs <- length(X)
     S.rare <- sum(T.X[as.numeric(names(T.X)) <= 10])
@@ -41,9 +42,28 @@
     }
     a <- sapply(i, COUNT, X)
     G <- a[1]/a[2]
-    S.Chao1 <- S.obs + a[1] * (a[1] - 1) / (a[2] + 1)/ 2
+    ## EstimateS uses basic Chao only if a[2] > 0, and switches to
+    ## bias-corrected version only if a[2] == 0. However, we always
+    ## use bias-corrected form. The switchin code is commented out so
+    ## that it is easy to put back.
+
+    ##if (a[2] > 0)
+    ##    S.Chao1 <- S.obs + SSC * a[1]^2/2/a[2]
+    ##else if (a[1] > 0)
+    ##
+    S.Chao1 <- S.obs + SSC * a[1]*(a[1]-1) / (a[2]+1)/2
+    ##else
+    ##    S.Chao1 <- S.obs
     Deriv.Ch1 <- gradF(a, i)
-    sd.Chao1 <- sqrt(a[2] * ((G^4)/4 + G^3 + (G^2)/2))
+    ##if (a[2] > 0)
+    ##    sd.Chao1 <- sqrt(a[2] * (SSC * (SSC * (G^4/4 + G^3) + G^2/2)))
+    ##else if (a[1] > 0)
+    sd.Chao1 <-
+        sqrt(SSC*(a[1]*(a[1]-1)/2/(a[2]+1) +
+                  SSC*(a[1]*(2*a[1]-1)^2/4/(a[2]+1)^2 +
+                       a[1]^2*a[2]*(a[1]-1)^2/4/(a[2]+1)^4)))
+    ##else
+    ##    sd.Chao1 <- 0
     C.ace <- 1 - a[1]/N.rare
     i <- 1:length(a)
     thing <- i * (i - 1) * a

Added: pkg/vegan/R/ordiareatest.R
===================================================================
--- pkg/vegan/R/ordiareatest.R	                        (rev 0)
+++ pkg/vegan/R/ordiareatest.R	2014-11-06 09:17:29 UTC (rev 2906)
@@ -0,0 +1,69 @@
+#' Permutation test for the area of convex hull or ellipse in ordination
+#'
+#' Finds if the area covered by a convex hull or fitted ellipse is
+#' smaller than expected under null hypothesis using permutation test.
+#'
+#' @param ord 2-d ordination
+#' @param factor defining groups
+#' @param are of convex hull of or an ellipse
+#' @param permutations: number, permutation matrix or a
+#' \code{\link[permute]{how}} definition.
+#' @param parallel parallel processing
+#' @param \dots other parameters passed to area functions
+#'
+#' @author Jari Oksanen
+`ordiareatest` <-
+    function(ord, groups, area = c("hull", "ellipse"), permutations = 999,
+             parallel = getOption("mc.cores"), ...)
+{
+    ## Function to find area
+    area <- match.arg(area)
+    areafun <- if (area == "hull") ordihull else ordiellipse
+    areafun <- match.fun(areafun)
+    ## Observed statistics
+    obs <- summary(areafun(ord, groups, draw = "none", ...))["Area",]
+    ## permutations
+    pfun <- function(take, ...)
+        summary(areafun(ord, groups[take], draw = "none", ...))["Area",]
+    perm <- getPermuteMatrix(permutations, length(groups))
+    nperm <- nrow(perm)
+    if (is.null(parallel))
+        parallel <- 1
+    hasClus <- inherits(parallel, "cluster")
+    if ((hasClus || parallel > 1) && require(parallel)) {
+        if(.Platform$OS.type == "unix" && !hasClus) {
+            areas <- do.call(cbind,
+                             mclapply(1:permutations,
+                                      function(i, ...) pfun(perm[i,],...),
+                                        mc.cores = parallel))
+            } else {
+                if (!hasClus) {
+                    parallel <- makeCluster(parallel)
+                }
+                areas <- parApply(parallel, perm, MARGIN=1, pfun)
+                if (!hasClus)
+                    stopCluster(parallel)
+            }
+    } else {
+        areas <- sapply(1:permutations, function(i, ...) pfun(perm[i,], ...))
+    }
+    signif <- (rowSums(areas <= obs) + 1)/(nperm + 1)
+    out <- list("areas" = obs, "pvalues" = signif, "permutations" = areas,
+                nperm = nperm, control = attr(perm, "control"), "kind" = area)
+    class(out) <- "ordiareatest"
+    out
+}
+
+### print method
+
+`print.ordiareatest` <-
+    function(x, ...)
+{
+    qu <- apply(x$permutations, 1, quantile, probs=c(0.05, 0.5))
+    m <- cbind("Area" = x$areas, t(qu), "Pr(<sim)" = x$pvalues)
+    cat("\n")
+    cat(gettextf("Permutation test for the size of ordination %ss\nAlternative hypothesis: observed area is smaller than random %s\n\n", x$kind, x$kind))
+    cat(howHead(x$control), "\n")
+    printCoefmat(m, tst.ind=1:3)
+    invisible(x)
+}

Modified: pkg/vegan/R/permustats.R
===================================================================
--- pkg/vegan/R/permustats.R	2014-10-29 08:36:32 UTC (rev 2905)
+++ pkg/vegan/R/permustats.R	2014-11-06 09:17:29 UTC (rev 2906)
@@ -195,6 +195,16 @@
               class="permustats")
 }
 
+`permustats.ordiareatest` <-
+    function(x, ...)
+{
+    structure(list(
+        "statistic" = x$areas,
+        "permutations" = t(x$permutations),
+        "alternative" = "less"),
+              class = "permustats")
+}
+
 `permustats.permutest.cca` <-
     function(x, ...)
 {

Modified: pkg/vegan/R/poolaccum.R
===================================================================
--- pkg/vegan/R/poolaccum.R	2014-10-29 08:36:32 UTC (rev 2905)
+++ pkg/vegan/R/poolaccum.R	2014-11-06 09:17:29 UTC (rev 2906)
@@ -10,6 +10,9 @@
     ## specpool() is slow, but the vectorized versions below are
     ## pretty fast
     for (i in 1:permutations) {
+        ## It is a bad practice to replicate specpool equations here:
+        ## if we change specpool, this function gets out of sync. You
+        ## should be ashamed, Jari Oksanen!
         take <- sample.int(n, n)
         tmp <- apply(x[take,] > 0, 2, cumsum)
         S[,i] <- rowSums(tmp > 0)
@@ -20,7 +23,8 @@
         boot[,i] <- 2*S[,i] - m + rowSums(exp(sweep(log1p(-sweep(tmp, 1, N, "/")), 1, N, "*") ))
         a1 <- rowSums(tmp == 1)
         a2 <- rowSums(tmp == 2)
-        chao[, i] <- S[,i] + ifelse(a2 > 0, a1*a1/2/a2, 0)
+        chao[, i] <- S[,i] + ifelse(a2 > 0, (N-1)/N*a1*a1/2/a2,
+                                    (N-1)/N*a1*(a1-1)/2)
         jack1[,i] <- S[,i] + a1 * (N-1)/N
         jack2[,i] <- S[,i] + a1*(2*N-3)/N - a2*(N-2)^2/N/(N-1)
     }

Modified: pkg/vegan/R/specpool.R
===================================================================
--- pkg/vegan/R/specpool.R	2014-10-29 08:36:32 UTC (rev 2905)
+++ pkg/vegan/R/specpool.R	2014-11-06 09:17:29 UTC (rev 2906)
@@ -1,5 +1,5 @@
-"specpool" <-
-    function (x, pool) 
+`specpool` <-
+    function (x, pool, smallsample = TRUE) 
 {
     x <- as.matrix(x)
     if (missing(pool)) 
@@ -25,6 +25,10 @@
         n <- length(gr)
         if (n <= 0)
             next
+        if (smallsample)
+            ssc <- (n-1)/n
+        else
+            ssc <- 1
         X <- x[gr, , drop = FALSE]
         freq <- colSums(X > 0)
         p <- freq[freq > 0]/n
@@ -37,8 +41,9 @@
             a2 <- sum(freq == 2)
         else 0
         chao[is] <- S[is] + if(!is.na(a2) && a2 > 0)
-            a1 * a1/2/a2
-        else 0
+            ssc * a1 * a1/2/a2
+        else
+            ssc * a1 * (a1-1)/2
         jack.1[is] <- S[is] + a1 * (n - 1)/n
         jack.2[is] <- S[is] + a1 * (2 * n - 3)/n - a2 * (n - 
                                                          2)^2/n/(n - 1)
@@ -46,7 +51,11 @@
         aa <- if (!is.na(a2) && a2 > 0) 
             a1/a2
         else 0
-        var.chao[is] <- a2 * (0.5 + (1 + aa/4) * aa) * aa * aa
+        if (a2 > 0)
+            var.chao[is] <- a2 * ssc * (0.5 + ssc * (1 + aa/4) * aa) * aa * aa
+        else
+            var.chao[is] <-
+                ssc * (ssc * (a1*(2*a1-1)^2/4 - a1^4/chao[is]/4) + a1*(a1-1)/2)
         if (!is.na(a1) && a1 > 0) {
             jf <- table(rowSums(X[, freq == 1, drop = FALSE] > 
                                 0))

Modified: pkg/vegan/R/text.cca.R
===================================================================
--- pkg/vegan/R/text.cca.R	2014-10-29 08:36:32 UTC (rev 2905)
+++ pkg/vegan/R/text.cca.R	2014-11-06 09:17:29 UTC (rev 2906)
@@ -7,13 +7,16 @@
         stop("only one 'display' item can be added in one command")
     pts <- scores(x, choices = choices, display = display, scaling = scaling,
                   const)
+    ## store rownames of pts for use later, otherwise if user supplies
+    ## labels, the checks in "cn" branch fail and "bp" branch will
+    ## be entered even if there should be no "bp" plotting
+    cnam <- rownames(pts)
     if (!missing(labels))
         rownames(pts) <- labels
     if (!missing(select))
         pts <- .checkSelect(select, pts)
     if (display == "cn") {
-        cnam <- rownames(pts)
-        text(pts, labels = cnam, ...)
+        text(pts, labels = rownames(pts), ...)
         pts <- scores(x, choices = choices, display = "bp", scaling = scaling,
                       const)
         bnam <- rownames(pts)

Modified: pkg/vegan/inst/ChangeLog
===================================================================
--- pkg/vegan/inst/ChangeLog	2014-10-29 08:36:32 UTC (rev 2905)
+++ pkg/vegan/inst/ChangeLog	2014-11-06 09:17:29 UTC (rev 2906)
@@ -42,6 +42,10 @@
 	was specified using `control` which is not an argument to the
 	`permutest()` method for `betadisper()`.
 
+	* text.cca: was incorrectly testing if factor constraints were
+	in the biplot scores when user-suppiled lables for factors were
+	given.
+
 	* ccanova: removed from vegan. These were backup functions of old
 	anova.cca, and were completely rewritten for new vegan.
 

Modified: pkg/vegan/inst/NEWS.Rd
===================================================================
--- pkg/vegan/inst/NEWS.Rd	2014-10-29 08:36:32 UTC (rev 2905)
+++ pkg/vegan/inst/NEWS.Rd	2014-11-06 09:17:29 UTC (rev 2906)
@@ -13,7 +13,7 @@
       integer giving the number of parallel processes. In unix-alikes
       (Mac OS, Linux) this will launch \code{"multicore"} processing
       and in Windows it will set up \code{"snow"} clusters as desribed
-      in the documentation of \pkg{parallel} package. If \code{option}
+      in the documentation of the \pkg{parallel} package. If \code{option}
       \code{"mc.cores"} is set to an integer > 1, this will be used to
       automatically start parallel processing. Finally, the argument
       can also be a previously set up \code{"snow"} cluster which will
@@ -23,21 +23,20 @@
       extensive documentation on parallel processing in \R.
 
       The following function use parallel processing in analysing
-      permutatin statistics: \code{adonis}, \code{anosim},
+      permutation statistics: \code{adonis}, \code{anosim},
       \code{anova.cca} (and \code{permutest.cca}), \code{mantel} (and
-      \code{mantel.partial}), \code{mrpp}, \code{permutest.betadisper}
-      and \code{simper}. In addition, \code{bioenv} can compare
-      several candidate sets of models in paralle, \code{metaMDS} can
-      launch several random starts in parallel, and \code{oecosimu}
-      can evaluate test statistics for several null models in
-      parallel.
+      \code{mantel.partial}), \code{mrpp}, \code{ordiareatest},
+      \code{permutest.betadisper} and \code{simper}. In addition,
+      \code{bioenv} can compare several candidate sets of models in
+      paralle, \code{metaMDS} can launch several random starts in
+      parallel, and \code{oecosimu} can evaluate test statistics for
+      several null models in parallel.
 
-      \item All permutation tests are based on the \pkg{permute}
-      package which offers strong tools for restricted
-      permutation. All these functions have argument
-      \code{permutations}. The default usage of simple non-restricted
-      permutations is achieved by giving a single integer number as
-      its argument. Restricted permutations can be defined using the
+      \item All permutation tests are based on the \pkg{permute} package
+      which offers strong tools for restricted permutation. All these
+      functions have argument \code{permutations}. The default usage of
+      simple non-restricted permutations is achieved by giving a single
+      integer number. Restricted permutations can be defined using the
       \code{how} function of the \pkg{permute} package. Finally, the
       argument can be a permutation matrix where rows define
       permutations. It is possible to use external or user constructed
@@ -50,13 +49,13 @@
       \code{vegandocs("permutations")}.
 
       The following functions use the \pkg{permute} package:
-      \code{CCorA}, \code{adonis}, \code{anosim}, \code{anova.cca}
-      (plus associated \code{permutest.cca}, \code{add1.cca},
+      \code{CCorA}, \code{adonis}, \code{anosim}, \code{anova.cca} (plus
+      associated \code{permutest.cca}, \code{add1.cca},
       \code{drop1.cca}, \code{ordistep}, \code{ordiR2step}),
       \code{envfit} (plus associated \code{factorfit} and
       \code{vectorfit}), \code{mantel} (and \code{mantel.partial}),
-      \code{mrpp}, \code{mso}, \code{permutest.betadisper},
-      \code{protest} and \code{simper}.
+      \code{mrpp}, \code{mso}, \code{ordiareatest},
+      \code{permutest.betadisper}, \code{protest} and \code{simper}.
 
       \item Community null model generation has been completely
       redesigned and rewritten. The communities are constructed with
@@ -103,27 +102,32 @@
       \item New \code{hclust} support functions \code{reorder},
       \code{rev} and \code{scores}. Functions \code{reorder} and
       \code{rev} are similar as these functions for \code{dendrogram}
-      objects. However, \code{reorder} can use (and defaults to)
-      weighted mean. In weighted mean the node average is always the
+      objects in base \R. However, \code{reorder} can use (and defaults
+      to) weighted mean. In weighted mean the node average is always the
       mean of member leaves, whereas the \code{dendrogram} uses always
-      the unweighted means of joined branches.
+      unweighted means of joined branches.
 
-      \item \code{permustats} to extract and inspect permutation
+      \item Function \code{ordiareatest} supplements \code{ordihull} and
+      \code{ordiellipse} and provides a randomization test for the
+      one-sided alternative hypothesis that convex hulls or ellipses in
+      two-dimensional ordination space have smaller areas than with
+      randomized groups.
+
+      \item Function \code{permustats} extracts and inspects permutation
       results with support functions \code{summary}, \code{density},
       \code{densityplot}, \code{qqnorm} and \code{qqmath}. The
-      \code{density} and \code{qqnorm} are standard \R{} tools that
-      only work with one statistic, and \code{densityplot} and
-      \code{qqmath} are \pkg{lattice} graphics that work with
-      univariate and multivariate statistics. The results of following
-      functions can be extracted: \code{anosim}, \code{adonis},
-      \code{mantel} (and \code{mantel.partial}), \code{mrpp},
-      \code{oecosimu}, \code{permustest.cca} (but not the
-      corresponding \code{anova} methods),
-      \code{permutest.betadisper}, and \code{protest}.
+      \code{density} and \code{qqnorm} are standard \R{} tools that only
+      work with one statistic, and \code{densityplot} and \code{qqmath}
+      are \pkg{lattice} graphics that work with univariate and
+      multivariate statistics. The results of following functions can be
+      extracted: \code{anosim}, \code{adonis}, \code{mantel} (and
+      \code{mantel.partial}), \code{mrpp}, \code{oecosimu},
+      \code{permustest.cca} (but not the corresponding \code{anova}
+      methods), \code{permutest.betadisper}, and \code{protest}.
 
-      \item \code{stressplot} functions display the ordination
-      distances at given number of dimensions against original
-      distances.  These are a generalization of \code{stressplot} for
+      \item \code{stressplot} functions display the ordination distances
+      at given number of dimensions against original distances.  The
+      method functins are similar to \code{stressplot} for
       \code{metaMDS}, and always use the inherent distances of each
       ordination method. The functions are available for the results
       \code{capscale}, \code{cca}, \code{princomp}, \code{prcomp},
@@ -139,7 +143,7 @@
       of a random value. 
 
       \item \code{ordiellipse} can handle points exactly on a line,
-      including only two points.
+      including only two points (with a warning).
 
       \item plotting \code{radfit} results for several species failed if
       any of the communities had no species or had only one species.
@@ -199,15 +203,41 @@
       \item \code{rankindex} can use Manhattan, Gower and Mahalanobis
       distance in addition to the default Euclidean.
 
-      \item User can set colours and line types in unction
+      \item User can set colours and line types in function
       \code{rarecurve} for plotting rarefaction curves.
 
       \item \code{spantree} gained a support function \code{as.hclust}
-      to change the minimum spannig tree into an \code{hclust} tree.
+      to change the minimum spanning tree into an \code{hclust} tree.
 
       \item \code{fitspecaccum} can do weighted analysis. Gained
       \code{lines} method.
 
+      \item Functions for extrapolated number of species or for the size
+      of species pool using Chao method were modified following Chiu et
+      al., \emph{Biometrics} 70, 671--682 (2014).
+
+      Incidence based \code{specpool} can now use (and defaults to)
+      small sample correction with number of sites as the sample
+      size. Function uses basic Chao extrapolation based on the ratio of
+      singletons and doubletons, but switches now to bias corrected Chao
+      extrapolation if there are no doubletons (species found
+      twice). The variance formula for bias corrected Chao was derived
+      following the supporting
+      \href{http://onlinelibrary.wiley.com/doi/10.1111/biom.12200/suppinfo}{online material}
+      and differs slightly from Chiu et al. (2014).
+
+      The \code{poolaccum} function was changed similarly, but the small
+      sample correction is used always.
+
+      The abundance based \code{estimateR} uses bias corrected Chao
+      extrapolation, but earlier it estimated its variance with classic
+      Chao model. Now we use the widespread
+      \href{http://viceroy.eeb.uconn.edu/EstimateS/EstimateSPages/EstSUsersGuide/EstimateSUsersGuide.htm#AppendixB}{approximate
+	equation} for variance.
+
+      With these changes these functions are more similar to
+            \href{http://viceroy.eeb.uconn.edu/EstimateS/EstimateSPages/EstSUsersGuide/EstimateSUsersGuide.htm#AppendixB}{EstimateS}.
+
       \item \code{tabasco} uses now \code{reorder.hclust} for
       \code{hclust} object for better ordering than previously when it
       cast trees to \code{dendrogram} objects.
@@ -224,7 +254,7 @@
       III.  \code{varespec} and \code{dune} use 8-character names (4
       from genus + 4 from species epithet). New data set on
       phylogenetic distances for \code{dune} was extracted from Zanne
-      et al. (\emph{Nature}, 2014).
+      et al. (\emph{Nature} 506, 89--92; 2014).
 
       \item User configurable plots for \code{rarecurve}.
 
@@ -243,12 +273,12 @@
       extract scaled results.
 
       \item \code{commsimulator} is deprecated. Replace
-      \code{commsimulator(x, method)} with \code{simulate(nullmodel(x,
-      method))}.
+      \code{commsimulator(x, method)} with
+      \code{simulate(nullmodel(x, method))}.
 
       \item \code{density} and \code{densityplot} for permutation
-      results: use \code{permustats} with its \code{density} and
-      \code{densityplot} method.
+      results are deprecated: use \code{permustats} with its
+      \code{density} and \code{densityplot} method.
 
     } %end itemize
   } % end deprecated

Modified: pkg/vegan/man/dune.taxon.Rd
===================================================================
--- pkg/vegan/man/dune.taxon.Rd	2014-10-29 08:36:32 UTC (rev 2905)
+++ pkg/vegan/man/dune.taxon.Rd	2014-11-06 09:17:29 UTC (rev 2906)
@@ -39,8 +39,7 @@
   I.J., Aarssen, L., Bertin, R.I., Calaminus, A., Govaerts, R.,
   Hemmings, F., Leishman, M.R., Oleksyn, J., Soltis, P.S., Swenson,
   N.G., Warman, L. & Beaulieu, J.M. (2014) Three keys to the radiation
-  of angiosperms into freezing environments. \emph{Nature}
-  doi:10.1038/nature12872 (published online Dec 22, 2013).
+  of angiosperms into freezing environments. \emph{Nature} 506, 89--92.
 
 }
 

Modified: pkg/vegan/man/ordihull.Rd
===================================================================
--- pkg/vegan/man/ordihull.Rd	2014-10-29 08:36:32 UTC (rev 2905)
+++ pkg/vegan/man/ordihull.Rd	2014-11-06 09:17:29 UTC (rev 2906)
@@ -9,6 +9,7 @@
 \alias{summary.ordihull}
 \alias{scores.ordihull}
 \alias{summary.ordiellipse}
+\alias{ordiareatest}
 
 \title{Display Groups or Factor Levels in Ordination Diagrams}
 
@@ -30,6 +31,8 @@
          w = weights(ord, display), ...)
 \method{summary}{ordihull}(object, ...)
 \method{summary}{ordiellipse}(object, ...)
+ordiareatest(ord, groups, area = c("hull", "ellipse"), permutations = 999,
+         parallel = getOption("mc.cores"), ...)
 }
 
 \arguments{
@@ -96,6 +99,19 @@
     can be saved, and used for summaries (areas etc. of hulls and
     ellipses). }
 
+  \item{area}{Evaluate the area of convex hulls of \code{ordihull}, or of
+    ellipses of \code{ordiellipse}.}
+
+  \item{permutations}{a list of control values for the permutations
+    as returned by the function \code{\link[permute]{how}}, or the
+    number of permutations required, or a permutation matrix where each
+    row gives the permuted indices.}
+
+  \item{parallel}{Number of parallel processes or a predefined socket
+    cluster.  With \code{parallel = 1} uses ordinary, non-parallel
+    processing. The parallel processing is done with \pkg{parallel}
+    package.}
+
   \item{\dots}{Parameters passed to graphical functions or to
     \code{\link{scores}} to select axes and scaling etc. } 
 }
@@ -114,6 +130,12 @@
   An ellipsoid hull can be drawn with function
   \code{\link[cluster]{ellipsoidhull}} of package \pkg{cluster}.
 
+  Function \code{ordihull} and \code{ordiellipse} return invisibly an
+  object that has a \code{summary} method that returns the coordinates
+  of centroids and areas of the hulls or ellipses. Function
+  \code{ordiareatest} studies the one-sided hypothesis that these
+  areas are smaller than with randomized \code{groups}.
+
   Function \code{ordispider} draws a \sQuote{spider} diagram where
   each point is connected to the group centroid with
   \code{\link{segments}}.  Weighted centroids are used in the
@@ -151,10 +173,7 @@
   Function \code{ordihull} returns a list of coordinates of the hulls
   (which can be extracted with \code{scores}), and \code{ordiellipse}
   returns a list of covariance matrices and scales used in drawing the
-  ellipses.  These result objects have a \code{summary} method that
-  returns the coordinates of the centres of the ellipses or hulls and
-  their surface areas in user units.  With \code{draw = "none"} only
-  the result object is returned and nothing is drawn.
+  ellipses.
 
 }
 

Modified: pkg/vegan/man/permustats.Rd
===================================================================
--- pkg/vegan/man/permustats.Rd	2014-10-29 08:36:32 UTC (rev 2905)
+++ pkg/vegan/man/permustats.Rd	2014-11-06 09:17:29 UTC (rev 2906)
@@ -9,6 +9,7 @@
 \alias{permustats.mrpp}
 \alias{permustats.mso}
 \alias{permustats.oecosimu}
+\alias{permustats.ordiareatest}
 \alias{permustats.permutest.betadisper}
 \alias{permustats.permutest.cca}
 \alias{permustats.protest}
@@ -92,7 +93,7 @@
   The \code{permustats} can extract permutation statistics from the
   results of \code{\link{adonis}}, \code{\link{anosim}},
   \code{\link{mantel}}, \code{\link{mantel.partial}},
-  \code{\link{mrpp}}, \code{\link{oecosimu}},
+  \code{\link{mrpp}}, \code{\link{oecosimu}}, \code{\link{ordiareatest}},
   \code{\link{permutest.cca}}, \code{\link{protest}}, and
   \code{\link{permutest.betadisper}}. NB, there is no \code{permustats}
   method for \code{\link{anova.cca}}, but only for

Modified: pkg/vegan/man/specpool.Rd
===================================================================
--- pkg/vegan/man/specpool.Rd	2014-10-29 08:36:32 UTC (rev 2905)
+++ pkg/vegan/man/specpool.Rd	2014-11-06 09:17:29 UTC (rev 2906)
@@ -19,7 +19,7 @@
   is based on abundances (counts) on single sample site. 
 }
 \usage{
-specpool(x, pool)
+specpool(x, pool, smallsample = TRUE)
 estimateR(x, ...)
 specpool2vect(X, index = c("jack1","jack2", "chao", "boot","Species"))
 poolaccum(x, permutations = 100, minsize = 3)
@@ -33,6 +33,8 @@
     for \code{plot} function.}
   \item{pool}{A vector giving a classification for pooling the sites in
     the species data. If missing, all sites are pooled together.}
+  \item{smallsample}{Use small sample correction \eqn{(N-1)/N}, where
+    \eqn{N} is the number of sites within the \code{pool}.}
   \item{X, object}{A \code{specpool} result object.}
   \item{index}{The selected index of extrapolated richness.}
   \item{permutations}{Number of permutations of sampling order of sites.}
@@ -59,33 +61,39 @@
   sites in the collection.  The variants of extrapolated richness in
   \code{specpool} are:
   \tabular{ll}{
-    Chao
-    \tab \eqn{S_P = S_0 + a1^2/(2*a2)}
+     Chao
+    \tab \eqn{S_P = S_0 + \frac{a_1^2}{2 a_2}\frac{N-1}{N}}{S_P = S_0 + a1^2/(2*a2) * (N-1)/N}
     \cr
+    Chao bias-corrected
+    \tab \eqn{S_P = S_0 + \frac{a_1(a_1-1)}{2(a_2+1)} \frac{N-1}{N}}{S_P = S_0 + a1*(a1-1)/(2*(a2+1)) * (N-1)/N}
+    \cr
     First order jackknife
     \tab \eqn{S_P = S_0 + a_1 \frac{N-1}{N}}{S_P = S_0 + a1*(N-1)/N}
     \cr
     Second order jackknife
     \tab \eqn{S_P = S_0 + a_1 \frac{2N - 3}{N} - a_2 \frac{(N-2)^2}{N
-	(N-1)}}{S_P = S_0 + a1*(2*n-3)/n - a2*(n-2)^2/n/(n-1)}
+	(N-1)}}{S_P = S_0 + a1*(2*N-3)/N - a2*(N-2)^2/N/(N-1)}
     \cr
     Bootstrap
     \tab \eqn{S_P = S_0 + \sum_{i=1}^{S_0} (1 - p_i)^N}{S_P = S_0 + Sum
       (1-p_i)^N}
     }
+    \code{specpool} normally uses basic Chao equation, but when there
+    are no doubletons (\eqn{a2=0}) it switches to bias-corrected
+    version. In that case the Chao equation simplifies to
+    \eqn{S_0 + \frac{1}{2} a_1 (a_1-1) \frac{N-1}{N}}{S_0 + (N-1)/N * a1*(a1-1)/2}.
 
-    The abundance-based estimates in \code{estimateR} use counts (frequencies) of
-    species in a single site. If called for a matrix or data frame, the
-    function will give separate estimates for each site.  The two
-    variants of extrapolated richness in \code{estimateR} are Chao
-    (unbiased variant) and ACE.  In the Chao estimate
-    \eqn{a_i} refers to number of species with abundance \eqn{i} instead
-    of incidence: 
+    The abundance-based estimates in \code{estimateR} use counts
+    (numbers of individuals) of species in a single site. If called for
+    a matrix or data frame, the function will give separate estimates
+    for each site.  The two variants of extrapolated richness in
+    \code{estimateR} are bias-corrected Chao and ACE (O'Hara 2005, Chiu
+    et al. 2014).  The Chao estimate is similar as the bias corrected
+    one above, but \eqn{a_i} refers to the number of species with
+    abundance \eqn{i} instead of number of sites, and the small-sample
+    correction is not used. The ACE estimate is defined as:
+
     \tabular{ll}{
-    Chao
-    \tab \eqn{S_P = S_0 + \frac{a_1 (a_1 -1)}{2 (a_2 + 1)}}{S_P = S_0 +
-      a1*(a1-1)/(2*(a2+1))}
-    \cr
     ACE
     \tab \eqn{S_P = S_{abund} + \frac{S_{rare}}{C_{ace}}+ \frac{a_1}{C_{ace}}
       \gamma^2_{ace}}{S_P = S_abund + S_rare/C_ace + a1/C_ace * gamma^2}
@@ -108,16 +116,16 @@
 
     Functions estimate the standard errors of the estimates. These
     only concern the number of added species, and assume that there is
-    no variance in the observed richness.
-    The equations of standard errors are too complicated to be reproduced in
-    this help page, but they can be studied in the \R source code of the
-    function.
-    The standard error are based on the following sources: Chao (1987)
-    for the Chao estimate and Smith and van Belle (1984) for the
-    first-order Jackknife and the bootstrap (second-order jackknife is
-    still missing). 
-    The variance estimator of \eqn{S_{ace}}{S_ace} was
-    developed by Bob O'Hara (unpublished).
+    no variance in the observed richness.  The equations of standard
+    errors are too complicated to be reproduced in this help page, but
+    they can be studied in the \R source code of the function and are
+    discussed in the \code{\link{vignette}} \dQuote{diversity-vegan}
+    that can be read with the \code{\link{vegandocs}} command. The
+    standard error are based on the following sources: Chiu et
+    al. (2014) for the Chao estimates and Smith and van Belle (1984)
+    for the first-order Jackknife and the bootstrap (second-order
+    jackknife is still missing).  For the variance estimator of
+    \eqn{S_{ace}}{S_ace} see O'Hara (2005).
 
   Functions \code{poolaccum} and \code{estaccumR} are similar to
   \code{\link{specaccum}}, but estimate extrapolated richness indices
@@ -127,12 +135,14 @@
   data. The functions share \code{summary} and \code{plot}
   methods. The \code{summary} returns quantile envelopes of
   permutations corresponding the given level of \code{alpha} and
-  standard deviation of permutations for each sample size. The
+  standard deviation of permutations for each sample size. NB., these
+  are not based on standard deviations estimated within \code{specpool}
+  or \code{estimateR}, but they are based on permutations. The
   \code{plot} function shows the mean and envelope of permutations
   with given \code{alpha} for models. The selection of models can be
   restricted and order changes using the \code{display} argument in
   \code{summary} or \code{plot}. For configuration of \code{plot}
-  command, see \code{\link[lattice]{xyplot}}
+  command, see \code{\link[lattice]{xyplot}}.
 }
 
 \value{
@@ -151,11 +161,18 @@
 \references{
   Chao, A. (1987). Estimating the population size for capture-recapture
   data with unequal catchability. \emph{Biometrics} 43, 783--791.
+
+  Chiu, C.H., Wang, Y.T., Walther, B.A. & Chao, A. (2014). Improved
+  nonparametric lower bound of species richness via a modified
+  Good-Turing frequency formula. \emph{Biometrics} 70, 671--682.
   
   Colwell, R.K. & Coddington, J.A. (1994). Estimating terrestrial
   biodiversity through
   extrapolation. \emph{Phil. Trans. Roy. Soc. London} B 345, 101--118.
 
+  O'Hara, R.B. (2005). Species richness estimators: how many species
+  can dance on the head of a pin? \emph{J. Anim. Ecol.} 74, 375--386.
+
   Palmer, M.W. (1990). The estimation of species richness by
   extrapolation. \emph{Ecology} 71, 1195--1198.
 
@@ -163,12 +180,19 @@
   species richness. \emph{Biometrics} 40, 119--129.
 }
 \author{Bob O'Hara (\code{estimateR}) and Jari Oksanen.}
-\note{
-  The functions are based on assumption that there is a species pool:
-  The community is closed so that there is a fixed pool size \eqn{S_P}.
-  Such cases may exist, although I have not seen them yet.  All indices
-  are biased for open communities.
 
+\note{ The functions are based on assumption that there is a species
+  pool: The community is closed so that there is a fixed pool size
+  \eqn{S_P}.  In general, the functions give only the lower limit of
+  species richness: the real richness is \eqn{S >= S_P}, and there is
+  a consistent bias in the estimates. Even the bias-correction in Chao
+  only reduces the bias, but does not remove it completely (Chiu et
+  al. 2014).
+
+  Optional small sample correction was added to \code{specpool} in
+  \pkg{vegan} 2.2-0. It was not used in the older literature (Chao
+  1987), but it is recommended recently (Chiu et al. 2014).
+
   See \url{http://viceroy.eeb.uconn.edu/EstimateS} for a more complete
   (and positive) discussion and alternative software for some platforms.
 }

[TRUNCATED]

To get the complete diff run:
    svnlook diff /svnroot/vegan -r 2906


More information about the Vegan-commits mailing list