[Vegan-commits] r1703 - in pkg/vegan/inst: . doc
noreply at r-forge.r-project.org
Tue Aug 9 20:52:08 CEST 2011
Author: jarioksa
Date: 2011-08-09 20:52:00 +0200 (Tue, 09 Aug 2011)
New Revision: 1703
Modified:
pkg/vegan/inst/ChangeLog
pkg/vegan/inst/doc/decision-vegan.Rnw
pkg/vegan/inst/doc/diversity-vegan.Rnw
pkg/vegan/inst/doc/intro-vegan.Rnw
Log:
vignettes use now jss style shipped with R
Modified: pkg/vegan/inst/ChangeLog
===================================================================
--- pkg/vegan/inst/ChangeLog 2011-08-09 09:00:27 UTC (rev 1702)
+++ pkg/vegan/inst/ChangeLog 2011-08-09 18:52:00 UTC (rev 1703)
@@ -13,6 +13,9 @@
Imports field of DESCRIPTION (same with vif.cca: vif is defined in
car, but we could have our private vif generic here?).
+ * Vignettes: use now jss.cls shipped with R instead of amsart.cls
+ for better consistency with R and permute style.
+
Version 1.90-2 (closed August 6, 2011)
* ordilabel: gained argument 'select'.
Modified: pkg/vegan/inst/doc/decision-vegan.Rnw
===================================================================
--- pkg/vegan/inst/doc/decision-vegan.Rnw 2011-08-09 09:00:27 UTC (rev 1702)
+++ pkg/vegan/inst/doc/decision-vegan.Rnw 2011-08-09 18:52:00 UTC (rev 1703)
@@ -1,52 +1,50 @@
% -*- mode: noweb; noweb-default-code-mode: R-mode; -*-
%\VignetteIndexEntry{Design decisions and implementation}
-\documentclass[a4paper,10pt]{amsart}
+\documentclass[article,a4paper,10pt,nojss]{jss}
-\usepackage{ucs}
-\usepackage[utf8x]{inputenc}
-\usepackage[T1]{fontenc}
-\usepackage[sort&compress]{natbib}
-\usepackage{hyperref}
-\usepackage{graphics}
+\usepackage{amsmath}
+%\usepackage{ucs}
+%\usepackage[utf8x]{inputenc}
+%\usepackage[T1]{fontenc}
\usepackage{sidecap}
-\setlength{\captionindent}{0pt}
-\usepackage{url}
\renewcommand{\floatpagefraction}{0.8}
\author{Jari Oksanen}
\title{Design decisions and implementation details in vegan}
-\date{$ $Id$ $
+\Abstract{
+ This document describes design decisions, and discusses implementation
+and algorithmic details in some vegan functions. The proper FAQ is
+another document.
+ }
+ \Keywords{nestedness, matrix temperature, community null models, scaling of PCA and RDA, WA
+ and LC scores}
+
+\Address{$ $Id$ $
processed with vegan
\Sexpr{packageDescription("vegan", field="Version")}
in \Sexpr{R.version.string} on \today}
+\Footername{This version}
+%% need no \usepackage{Sweave.sty}
\begin{document}
\SweaveOpts{strip.white=true}
\setkeys{Gin}{width=0.55\linewidth}
<<echo=false,results=hide>>=
figset <- function() par(mar=c(4,4,1,1)+.1)
options(SweaveHooks = list(fig = figset))
+options("prompt" = "R> ", "continue" = "+ ")
require(vegan)
@
\maketitle
-\begin{abstract}
-\noindent This document describes design decisions, and discusses implementation
-and algorithmic details in some vegan functions. The proper FAQ is
-another document.
-
-\end{abstract}
-
-\tableofcontents
-
\section{Nestedness and Null models}
Some indicators of nestedness and null models of communities are only
described in general terms, and they could be implemented in various
-ways. Here I discuss the implementation in \texttt{vegan}.
+ways. Here I discuss the implementation in \pkg{vegan}.
\subsection{Matrix temperature}
@@ -81,12 +79,12 @@
principles. Rodr{\'i}guez-Giron{\'e}s and Santamaria \cite{RodGir06}
have seen the original code and reveal more details of calculations,
and their explanation is the basis of the implementation in
-\texttt{vegan}. However, there are still some open issues, and
-probably \texttt{vegan} function \texttt{nestedtemp} will never
+\pkg{vegan}. However, there are still some open issues, and
+probably \pkg{vegan} function \code{nestedtemp} will never
exactly reproduce results from other programs, although it is based on
the same general principles. I try to give main computation details in
this document --- all details can be seen in the source code of
-\texttt{nestedtemp}.
+\code{nestedtemp}.
\begin{itemize}
\item Species and sites are put into unit square \citep{RodGir06}. The
@@ -110,8 +108,8 @@
the indices can be reversed for corresponding row indexing.
Ordering by $s$ packs presences to the top left corner, and
 ordering by $t$ packs zeros away from the top left corner. The final
- sorting should be ``a compromise'' \cite{RodGir06} between these
- scores, and \texttt{vegan} uses $s+t$. The result should be cool,
+ sorting should be ``a compromise'' \citep{RodGir06} between these
+ scores, and \pkg{vegan} uses $s+t$. The result should be cool,
but the packing does not try to minimize the temperature
\citep{RodGir06}. I do not know how the ``compromise'' is
defined, and this can cause some differences to other
@@ -128,20 +126,20 @@
the parameter $p$ is selected so that the curve covers the same
area as is the proportion of presences
(Fig. \ref{fig:nestedtemp}). The parameter $p$ is found
- numerically using \textsf{R} functions \texttt{integrate} and
- \texttt{uniroot}. The fill line used in the original matrix
- temperature software \cite{AtmarPat93} is supposed to be similar
+ numerically using \proglang{R} functions \code{integrate} and
+ \code{uniroot}. The fill line used in the original matrix
+ temperature software \citep{AtmarPat93} is supposed to be similar
\citep{RodGir06}. Small details in the fill line combined with
differences in scores used in the unit square (especially in the
corners) can cause large differences in the results.
\item A line with slope $-1$ is drawn through the point and the $x$
coordinate of the intersection of this line and the fill line is
- found using function \texttt{uniroot}. The difference of this
+ found using function \code{uniroot}. The difference of this
intersection and the row coordinate gives the argument $d$ of matrix
temperature (Fig. \ref{fig:nestedtemp}).
\item In other software, ``duplicated'' species occurring on every
site are removed, as well as empty sites and species after
- reordering \cite{RodGir06}. This is not done in \texttt{vegan}.
+ reordering \cite{RodGir06}. This is not done in \pkg{vegan}.
\end{itemize}
\subsection{Backtracking}
@@ -160,7 +158,7 @@
\cite{GotelliEnt01} does not give many hints on implementing a fill
algorithm as a community null model.
-The backtracking is implemented in two stages in \textbf{vegan}: filling and
+The backtracking is implemented in two stages in \pkg{vegan}: filling and
backtracking.
\begin{enumerate}
\item The matrix is filled in the order given by the marginal
@@ -192,11 +190,11 @@
This chapter discusses the scaling of scores (results) in redundancy
analysis and principal component analysis performed by function
-\texttt{rda} in the \texttt{vegan} library.
+\code{rda} in the \pkg{vegan} library.
Principal component analysis, and hence redundancy analysis, is a case
of singular value decomposition (\textsc{svd}). Functions
-\texttt{rda} and \texttt{prcomp} even use \textsc{svd} internally in
+\code{rda} and \code{prcomp} even use \textsc{svd} internally in
their algorithm.
In \textsc{svd} a centred data matrix is decomposed into orthogonal
@@ -239,14 +237,14 @@
\begin{table}
\caption{\label{tab:scales} Alternative scalings for \textsc{rda} used
- in the functions \texttt{prcomp} and \texttt{princomp}, and the
- one used in the \texttt{vegan} function \texttt{rda}
- and the proprietary software \texttt{Canoco}
+ in the functions \code{prcomp} and \code{princomp}, and the
+ one used in the \pkg{vegan} function \code{rda}
+ and the proprietary software \proglang{Canoco}
  scores in terms of orthonormal site ($u_{ik}$) and species scores
  ($v_{jk}$), eigenvalues ($\lambda_k$), number of sites ($n$) and
- species standard deviations ($s_j$). In \texttt{rda},
+ species standard deviations ($s_j$). In \code{rda},
$\mathrm{const} = \sqrt[4]{(n-1) \sum \lambda_k}$. Corresponding
- negative scaling in \texttt{vegan}
+ negative scaling in \pkg{vegan}
% and corresponding positive scaling in \texttt{Canoco 3}
is derived
dividing each species by its standard deviation $s_j$ (possibly
@@ -254,31 +252,31 @@
\begin{tabular}{lcc}
& \textbf{Site scores} $u_{ik}^*$ &
\textbf{Species scores} $v_{jk}^*$ \\
-\texttt{prcomp, princomp} &
+\code{prcomp, princomp} &
$u_{ik} \sqrt{n-1} \sqrt{\lambda_k}$ &
$v_{jk}$ \\
-\texttt{rda, scaling=1} &
+\code{rda, scaling=1} &
$u_{ik} \sqrt{\lambda_k/ \sum \lambda_k} \times \mathrm{const}$ &
$v_{jk} \times \mathrm{const}$
\\
-\texttt{rda, scaling=2} &
+\code{rda, scaling=2} &
$u_{ik} \times \mathrm{const}$ &
$v_{jk} \sqrt{\lambda_k/ \sum \lambda_k} \times \mathrm{const}$ \\
-\texttt{rda, scaling=3} &
+\code{rda, scaling=3} &
$u_{ik} \sqrt[4]{\lambda_k/ \sum \lambda_k} \times \mathrm{const}$ &
$v_{jk} \sqrt[4]{\lambda_k/ \sum \lambda_k} \times \mathrm{const}$ \\
-\texttt{rda, scaling < 0} &
+\code{rda, scaling < 0} &
$u_{ik}^*$ &
$\sqrt{\sum \lambda_k /(n-1)} s_j^{-1} v_{jk}^*$
% \\
-% \texttt{Canoco 3, scaling=-1} &
+% \code{Canoco 3, scaling=-1} &
% $u_{ik} \sqrt{n-1} \sqrt{\lambda_k / \sum \lambda_k}$ &
% $v_{jk} \sqrt{n}$ \\
-% \texttt{Canoco 3, scaling=-2} &
+% \code{Canoco 3, scaling=-2} &
% $u_{ik} \sqrt{n-1}$ &
% $v_{jk} \sqrt{n} \sqrt{\lambda_k / \sum \lambda_k}$
% \\
-% \texttt{Canoco 3, scaling=-3} &
+% \code{Canoco 3, scaling=-3} &
% $u_{ik} \sqrt{n-1} \sqrt[4]{\lambda_k / \sum \lambda_k}$ &
% $v_{jk} \sqrt{n} \sqrt[4]{\lambda_k / \sum \lambda_k}$
\end{tabular}
@@ -289,16 +287,16 @@
or a graphical, low-dimensional approximation of the data, the graph
is called a biplot. The graph is a biplot if the transformed scores
satisfy $x_{ij} = c \sum_k u_{ik}^* v_{jk}^*$ where $c$ is a scaling
-constant. In functions \texttt{princomp}, \texttt{prcomp} and
-\texttt{rda}, $c=1$ and the plotted scores are a biplot so that the
+constant. In functions \code{princomp}, \code{prcomp} and
+\code{rda}, $c=1$ and the plotted scores are a biplot so that the
singular values (or eigenvalues) are expressed for sites, and species
are left unscaled.
% For \texttt{Canoco 3} $c = n^{-1} \sqrt{n-1}
-% \sqrt{\sum \lambda_k}$ with negative \texttt{Canoco} scaling
+% \sqrt{\sum \lambda_k}$ with negative \proglang{Canoco} scaling
% values. All these $c$ are constants for a matrix, so these are all
% biplots with different internal scaling of species and site scores
-% with respect to each other. For \texttt{Canoco} with positive scaling
-% values and \texttt{vegan} with negative scaling values, no constant
+% with respect to each other. For \proglang{Canoco} with positive scaling
+% values and \pkg{vegan} with negative scaling values, no constant
% $c$ can be found, but the correction is dependent on species standard
% deviations $s_j$, and these scores do not define a biplot.
@@ -313,45 +311,45 @@
scores scaled by eigenvalues will have a narrower dispersion. For
graphical biplots we should be able to fix the relations of row and
column scores to be invariant against scaling of data. The solution
-in R standard function \texttt{biplot} is to scale site and species
+in R standard function \code{biplot} is to scale site and species
scores independently, and typically very differently, but plot each
-independently to fill the graph area. The solution in \texttt{Canoco} and
-and \texttt{rda} is to use proportional eigenvalues $\lambda_k / \sum
+independently to fill the graph area. The solution in \proglang{Canoco}
+and \code{rda} is to use proportional eigenvalues $\lambda_k / \sum
\lambda_k$ instead of original eigenvalues. These proportions are
invariant with scale changes, and typically they have a nice range for
plotting two data sets in the same graph.
The \textbf{vegan} package uses a scaling constant $c = \sqrt[4]{(n-1)
\sum \lambda_k}$ in order to be able to use scaling by proportional
-eigenvalues (like in \texttt{Canoco}) and still be able to have a
-biplot scaling. Because of this, the scaling of \texttt{rda} scores is
-non-standard. However, the \texttt{scores} function lets you to set
+eigenvalues (like in \proglang{Canoco}) and still be able to have a
+biplot scaling. Because of this, the scaling of \code{rda} scores is
+non-standard. However, the \code{scores} function lets you set
the scaling constant to any desired values. It is also possible to
have two separate scaling constants: the first for the species, and
the second for sites and friends, and this allows getting scores of
other software or R functions (Table \ref{tab:rdaconst}).
\begin{table}
- \caption{\label{tab:rdaconst} Values of the \texttt{const} argument in
+ \caption{\label{tab:rdaconst} Values of the \code{const} argument in
\textbf{vegan} to get the scores that are equal to those from
other functions and software. Number of sites (rows) is $n$,
the number of species (columns) is $m$, and the sum of all
eigenvalues is $\sum_k \lambda_k$ (this is saved as the item
- \texttt{tot.chi} in the \texttt{rda} result)}.
+ \code{tot.chi} in the \code{rda} result)}.
\begin{tabular}{lccc}
& \textbf{Scaling} &\textbf{Species constant} & \textbf{Site constant} \\
-\texttt{vegan} & any & $\sqrt[4]{(n-1) \sum \lambda_k}$ & $\sqrt[4]{(n-1) \sum \lambda_k}$\\
-\texttt{prcomp}, \texttt{princomp} & \texttt{1} & $1$ & $\sqrt{(n-1) \sum_k \lambda_k}$\\
-\texttt{Canoco 3} & \texttt{-1, -2, -3} & $\sqrt{n-1}$ & $\sqrt{n}$\\
-\texttt{Canoco 4} & \texttt{-1, -2, -3} & $\sqrt{m}$ & $\sqrt{n}$
+\pkg{vegan} & any & $\sqrt[4]{(n-1) \sum \lambda_k}$ & $\sqrt[4]{(n-1) \sum \lambda_k}$\\
+\code{prcomp}, \code{princomp} & \code{1} & $1$ & $\sqrt{(n-1) \sum_k \lambda_k}$\\
+\texttt{Canoco 3} & \code{-1, -2, -3} & $\sqrt{n-1}$ & $\sqrt{n}$\\
+\texttt{Canoco 4} & \code{-1, -2, -3} & $\sqrt{m}$ & $\sqrt{n}$
\end{tabular}
\end{table}
In this chapter, I used always centred data matrices. In principle
\textsc{svd} could be done with original, non-centred data, but
-there is no option for this in \texttt{rda}, because I think that
+there is no option for this in \code{rda}, because I think that
non-centred analysis is dubious and I do not want to encourage its use
(if you think you need it, you are certainly so good in programming
-that you can change that one line in \texttt{rda.default}). I do
+that you can change that one line in \code{rda.default}). I do
think that the arguments for non-centred analysis are often twisted,
and the method is not very good for its intended purpose, but there
are better methods for finding fuzzy classes. Normal, centred
@@ -381,8 +379,8 @@
\end{itemize}
Many computer programs for constrained ordinations give only or
primarily LC scores, following Mike Palmer's recommendation
-\cite{Palmer93}. However, functions \texttt{cca} and \texttt{rda} in
-the \texttt{vegan} package use primarily WA scores. This chapter
+\cite{Palmer93}. However, functions \code{cca} and \code{rda} in
+the \pkg{vegan} package use primarily WA scores. This chapter
explains the reasons for this choice.
Briefly, the main reasons are that
@@ -401,7 +399,7 @@
scores.
\end{itemize}
This article studies mainly the first point. The users of
-\texttt{vegan} have a choice of either LC or WA (default) scores, but
+\pkg{vegan} have a choice of either LC or WA (default) scores, but
after reading this article, I believe that most of them do not want to
use LC scores, because they are not what they were looking for in
ordination.
@@ -417,7 +415,7 @@
data(varechem)
orig <- cca(varespec ~ Al + K, varechem)
@
-Function \texttt{cca} in \texttt{vegan} uses WA scores as
+Function \code{cca} in \pkg{vegan} uses WA scores as
default. So we must specifically ask for LC scores
(Fig. \ref{fig:ccalc}).
<<a,fig=false>>=
@@ -432,7 +430,7 @@
\end{SCfigure}
What would happen to linear combinations of LC scores if we shuffle
-the ordering of sites in species data? Function \texttt{sample()} below
+the ordering of sites in species data? Function \code{sample()} below
shuffles the indices.
<<>>=
i <- sample(nrow(varespec))
@@ -514,7 +512,7 @@
proc <- procrustes(scores(tmp1, dis="lc", choi=1:14), scores(tmp2, dis="lc", choi=1:14))
max(residuals(proc))
@
-In \texttt{cca} the difference would be somewhat larger than now
+In \code{cca} the difference would be somewhat larger than now
observed \Sexpr{format.pval(max(residuals(proc)))} because site
weights used for environmental variables are shuffled with the species
data.
@@ -550,7 +548,7 @@
show where the site \emph{should} be, the WA scores show where the
site \emph{is}.
-Function \texttt{ordispider} adds line segments to connect each WA
+Function \code{ordispider} adds line segments to connect each WA
score with the corresponding LC (Fig. \ref{fig:walcspider}).
<<a,fig=false>>=
plot(orig, display="wa", type="points")
@@ -566,7 +564,7 @@
\label{fig:walcspider}
\end{SCfigure}
This is the standard way of displaying results of discriminant
-analysis, too. Moisture classes \texttt{1} and \texttt{2} seem to be
+analysis, too. Moisture classes \code{1} and \code{2} seem to be
overlapping, and cannot be completely separated by their
vegetation. Other classes are more distinct, but there seems to be a
clear arc effect or a ``horseshoe'' despite using CCA.
@@ -577,13 +575,11 @@
independent of vegetation. If you plot them, you plot only your
environmental variables. WA scores are based on vegetation data but
are constrained to be as similar to the LC scores as only
-possible. Therefore \texttt{vegan} calls LC scores as
-\texttt{constraints} and WA scores as \texttt{site scores}, and uses
+possible. Therefore \pkg{vegan} calls LC scores as
+\code{constraints} and WA scores as \code{site scores}, and uses
primarily WA scores in plotting. However, the user makes the ultimate
choice, since both scores are available.
-
-\bibliographystyle{plain}
\bibliography{vegan}
\end{document}
Modified: pkg/vegan/inst/doc/diversity-vegan.Rnw
===================================================================
--- pkg/vegan/inst/doc/diversity-vegan.Rnw 2011-08-09 09:00:27 UTC (rev 1702)
+++ pkg/vegan/inst/doc/diversity-vegan.Rnw 2011-08-09 18:52:00 UTC (rev 1703)
@@ -1,20 +1,30 @@
% -*- mode: noweb; noweb-default-code-mode: R-mode; -*-
%\VignetteIndexEntry{Diversity analysis in vegan}
-\documentclass[a4paper,10pt]{amsart}
-\usepackage{ucs}
-\usepackage[utf8x]{inputenc}
-\usepackage[T1]{fontenc}
-\usepackage{graphicx}
+\documentclass[article,a4paper,10pt,nojss]{jss}
+%\usepackage{ucs}
+%\usepackage[utf8x]{inputenc}
+%\usepackage[T1]{fontenc}
\usepackage{sidecap}
-\setlength{\captionindent}{0pt}
-\usepackage{url}
+\usepackage{amsmath}
\title{Vegan: ecological diversity}
\author{Jari Oksanen}
-\date{$ $Id$ $
+\Abstract{
+ }
+\Keywords{diversity, Shannon, Rényi, Hill number, Tsallis,
+ rarefaction, species accumulation, beta diversity, species
+ abundance, Fisher alpha, Fisher logarithmic series, Preston
+ log-normal model, extended richness, taxonomic diversity, functional
+ diversity, species pool}
+
+%% misuse next for scm data
+\Address{$ $Id$ $
processed with vegan \Sexpr{packageDescription("vegan", field="Version")}
in \Sexpr{R.version.string} on \today}
+\Footername{This version}
+
+%% need no \usepackage{Sweave}
\begin{document}
\setkeys{Gin}{width=0.55\linewidth}
\SweaveOpts{strip.white=true}
@@ -23,12 +33,11 @@
options(width=72)
figset <- function() par(mar=c(4,4,1,1)+.1)
options(SweaveHooks = list(fig = figset))
+options("prompt" = "R> ", "continue" = "+ ")
@
-\maketitle
-\tableofcontents
-\noindent The \texttt{vegan} package has two major components:
+\noindent The \pkg{vegan} package has two major components:
multivariate analysis (mainly ordination), and methods for diversity
analysis of ecological communities. This document gives an
introduction to the latter. Ordination methods are covered in other
@@ -49,7 +58,7 @@
\section{Diversity indices}
-Function \texttt{diversity} finds the most commonly used diversity
+Function \code{diversity} finds the most commonly used diversity
indices:
\begin{align}
H &= - \sum_{i=1}^S p_i \log_b p_i & \text{Shannon--Weaver}\\
@@ -67,16 +76,16 @@
@
which finds diversity indices for all sites.
-\texttt{Vegan} does not have indices for evenness (equitability), but
+\pkg{vegan} does not have indices for evenness (equitability), but
the most common of these, Pielou's evenness $J = H'/\log(S)$ is easily
found as:
<<>>=
J <- H/log(specnumber(BCI))
@
-where \texttt{specnumber} is a simple \texttt{vegan} function to find
+where \code{specnumber} is a simple \pkg{vegan} function to find
the numbers of species.
-\texttt{Vegan} also can estimate R\'{e}nyi diversities of order $a$:
+\pkg{vegan} also can estimate R\'{e}nyi diversities of order $a$:
\begin{equation}
H_a = \frac{1}{1-a} \log \sum_{i=1}^S p_i^a
\end{equation}
@@ -92,8 +101,8 @@
@
We can really regard a site more diverse if all of its R\'{e}nyi
diversities are higher than in another site. We can inspect this
-graphically using the standard \texttt{plot} function for the
-\texttt{renyi} result (Fig. \ref{fig:renyi}).
+graphically using the standard \code{plot} function for the
+\code{renyi} result (Fig. \ref{fig:renyi}).
<<echo=false,results=hide>>=
require(lattice, quietly=TRUE)
@
@@ -224,11 +233,11 @@
make a great difference if two individuals belong to a different
species or to a different genus.
-Function \texttt{taxondive} implements indices of taxonomic diversity,
-and \texttt{taxa2dist} can be used to convert classification tables to
+Function \code{taxondive} implements indices of taxonomic diversity,
+and \code{taxa2dist} can be used to convert classification tables to
taxonomic distances either with constant or variable step lengths
between successive categories. There is no taxonomic table for the BCI
-data in \texttt{vegan}\footnote{Actually I made such a classification,
+data in \pkg{vegan}\footnote{Actually I made such a classification,
but taxonomic differences proved to be of little use in the Barro
Colorado data: they only singled out sites with Monocots (palm
trees) in the data.}
@@ -260,12 +269,12 @@
difference is evaluated only once instead of evaluating its distance
to all other species.
-Function \texttt{treedive} implements functional diversity defined as
+Function \code{treedive} implements functional diversity defined as
the total branch length in a trait dendrogram connecting all species,
but excluding the unnecessary root segments of the tree. The example
uses the taxonomic distances of the previous chapter. These are first
converted to a hierarchic clustering (which actually were their
-original form before \texttt{taxa2dist} converted them into distances)
+original form before \code{taxa2dist} converted them into distances)
<<>>=
tr <- hclust(taxdis, "aver")
mod <- treedive(dune, tr)
@@ -275,7 +284,7 @@
Diversity indices may be regarded as variance measures of species
abundance distribution. We may wish to inspect abundance
-distributions more directly. \texttt{Vegan} has functions for
+distributions more directly. \pkg{vegan} has functions for
Fisher's log-series and Preston's log-normal models, and in addition
several models for species abundance distribution.
@@ -305,11 +314,11 @@
\end{SCfigure}
We already saw $\alpha$ as a diversity index. Now we also obtained
estimate of standard error of $\alpha$ (these also are optionally
-available in \texttt{fisherfit}). The standard errors are based on
+available in \code{fisherfit}). The standard errors are based on
the second derivatives (curvature) of log-likelihood at the solution
of $\alpha$. The distribution of $\alpha$ is often non-normal
and skewed, and standard errors are of not much use. However,
-\texttt{fisherfit} has a \texttt{profile} method that can be used to
+\code{fisherfit} has a \code{profile} method that can be used to
inspect the validity of normal assumptions, and will be used in
calculations of confidence intervals from profile deviance:
<<>>=
@@ -324,10 +333,10 @@
at the left.
There are two alternative functions for the log-normal model:
-\texttt{prestonfit} and \texttt{prestondistr}. Function
-\texttt{prestonfit} uses traditionally binning approach, and is burdened
+\code{prestonfit} and \code{prestondistr}. Function
+\code{prestonfit} uses the traditional binning approach, and is burdened
with arbitrary choices of binning limits and treatment of ties.
-Function \texttt{prestondistr} directly
+Function \code{prestondistr} directly
maximizes truncated log-normal likelihood without binning data, and it
is the recommended alternative. Log-normal models usually fit poorly
to the BCI data, but here our random plot (number \Sexpr{k}):
@@ -342,7 +351,7 @@
species. These are known as ranked abundance
distribution curves, species abundance curves, dominance--diversity
curves or Whittaker plots.
-Function \texttt{radfit} fits some of the most popular models using
+Function \code{radfit} fits some of the most popular models using
maximum likelihood estimation:
\begin{align}
\hat a_r &= \frac{N}{S} \sum_{k=r}^S \frac{1}{k} &\text{brokenstick}\\
@@ -359,7 +368,7 @@
$\beta$ and $c$ are the estimated parameters in each model.
It is customary to define the models for proportions $p_r$ instead of
-abundances $a_r$, but there is no reason for this, and \texttt{radfit}
+abundances $a_r$, but there is no reason for this, and \code{radfit}
is able to work with the original abundance data. We have count data,
and the default Poisson error looks appropriate, and our example data
set gives (Fig. \ref{fig:rad}):
@@ -376,17 +385,17 @@
\label{fig:rad}
\end{SCfigure}
-Function \texttt{radfit} compares the models using alternatively
+Function \code{radfit} compares the models using alternatively
Akaike's or Schwartz's Bayesian information criteria. These are based
on log-likelihood, but penalized by the number of estimated
parameters. The penalty per parameter is $2$ in \textsc{aic}, and
$\log S$ in \textsc{bic}. Brokenstick is regarded as a null model and
-has no estimated parameters in \texttt{vegan}. Preemption model has
+has no estimated parameters in \pkg{vegan}. Preemption model has
one estimated parameter ($\alpha$), log-normal and Zipf models two
($\mu, \sigma$, or $\hat p_1, \gamma$, resp.), and Zipf--Mandelbrot
model has three ($c, \beta, \gamma$).
-Function \texttt{radfit} also works with data frames, and fits models
+Function \code{radfit} also works with data frames, and fits models
for each site. It is curious that log-normal model rarely is the
choice, although it generally is regarded as the canonical model, in
particular in data sets like Barro Colorado tropical forests.
@@ -465,7 +474,7 @@
Subtraction of one means that $\beta = 0$ when there are no excess
species or no heterogeneity between sites. For this index, no specific
functions are needed, but this index can be easily found with the help
-of \texttt{vegan} function \texttt{specnumber}:
+of \pkg{vegan} function \code{specnumber}:
<<>>=
ncol(BCI)/mean(specnumber(BCI)) - 1
@
@@ -482,7 +491,7 @@
\beta = \frac{a+b+c}{(2a+b+c)/2} - 1 = \frac{b+c}{2a+b+c}
\end{equation}
This is the S{\o}rensen index of dissimilarity, and it can be found
-for all sites using \texttt{vegan} function \texttt{vegdist} with
+for all sites using \pkg{vegan} function \code{vegdist} with
binary data:
<<>>=
beta <- vegdist(BCI, binary=TRUE)
@@ -491,7 +500,7 @@
There are many other definitions of beta diversity in addition to
eq. \ref{eq:beta}. All commonly used indices can be found using
-\texttt{betadiver}. The indices in \texttt{betadiver} can be referred
+\code{betadiver}. The indices in \code{betadiver} can be referred
to by subscript name, or index number:
<<>>=
betadiver(help=TRUE)
@@ -511,7 +520,7 @@
islands can be regarded as subsets of the same community, indicating
that we really should talk about gradient differences if $z > 0.3$. We
can find the value of $z$ for a pair of plots using function
-\texttt{betadiver}:
+\code{betadiver}:
<<>>=
z <- betadiver(BCI, "z")
quantile(z)
@@ -519,7 +528,7 @@
The size $X$ and parameter $c$ cancel out, and the index gives the
estimate $z$ for any pair of sites.
-Function \texttt{betadisper} can be used to analyse beta diversities
+Function \code{betadisper} can be used to analyse beta diversities
with respect to classes or factors. There is no such classification
available for the Barro Colorado Island data, and the example studies
beta diversities in the management classes of the dune meadows
@@ -546,15 +555,15 @@
Species accumulation models indicate that not all species were seen in
any site. These unseen species also belong to the species pool.
-Functions \texttt{specpool} and \texttt{estimateR} implement some
+Functions \code{specpool} and \code{estimateR} implement some
methods of estimating the number of unseen species. Function
-\texttt{specpool} studies a collection of sites, and
-\texttt{estimateR} works with counts of individuals, and can be used
+\code{specpool} studies a collection of sites, and
+\code{estimateR} works with counts of individuals, and can be used
with a single site. Both functions assume that the number of unseen
species is related to the number of rare species, or species seen only
once or twice.
-Function \texttt{specpool} implements the following models to estimate
+Function \code{specpool} implements the following models to estimate
the pool size $S_p$:
\begin{align}
S_p &= S_o + \frac{f_1^2}{2 f_2} & \text{Chao}\\
@@ -582,7 +591,7 @@
s^2 = \left(\sum_{i=1}^N r_i^2 - \frac{f_1}{N}\right) \frac{N-1}{N}
\end{equation}
Variance of the second-order jackknife is not evaluated in
-\texttt{specpool} (but contributions are welcome).
+\code{specpool} (but contributions are welcome).
For the variance of bootstrap estimator, it is practical to define a
new variable $q_i = (1-p_i)^N$ for each species:
\begin{equation}
@@ -608,11 +617,11 @@
\subsection{Pool size from a single site}
-The \texttt{specpool} function needs a collection of sites, but there
+The \code{specpool} function needs a collection of sites, but there
are some methods that estimate the number of unseen species for each
single site. These functions need counts of individuals, and species
seen only once or twice, or other rare species, take the place of
-species with low frequencies. Function \texttt{estimateR} implements
+species with low frequencies. Function \code{estimateR} implements
two of these methods:
<<>>=
estimateR(BCI[k,])
@@ -647,11 +656,11 @@
\end{equation}
where $S_\mu$ is the modal height or the expected number of species at
maximum (at $\mu$), and $\sigma$ is the width. Function
-\texttt{veiledspec} estimates this integral from a model fitted either
-with \texttt{prestondistr} or \texttt{prestonfit}, and fits the latter
+\code{veiledspec} estimates this integral from a model fitted either
+with \code{prestondistr} or \code{prestonfit}, and fits the latter
if raw site data are given. Log-normal model fits badly, and
-\texttt{prestonfit} is particularly poor. Therefore the following
-explicitly uses \texttt{prestondistr}, although this also may fail:
+\code{prestonfit} is particularly poor. Therefore the following
+explicitly uses \code{prestondistr}, although this also may fail:
<<>>=
veiledspec(prestondistr(BCI[k,]))
veiledspec(BCI[k,])
@@ -666,7 +675,7 @@
a species. The probability for each species at each site is assessed
from other species occurring on the site.
-Function \texttt{beals} implement Beals smoothing:
+Function \code{beals} implements Beals smoothing:
<<>>=
smo <- beals(BCI)
@
Modified: pkg/vegan/inst/doc/intro-vegan.Rnw
===================================================================
--- pkg/vegan/inst/doc/intro-vegan.Rnw 2011-08-09 09:00:27 UTC (rev 1702)
+++ pkg/vegan/inst/doc/intro-vegan.Rnw 2011-08-09 18:52:00 UTC (rev 1703)
@@ -1,24 +1,33 @@
% -*- mode: noweb; noweb-default-code-mode: R-mode; -*-
%\VignetteIndexEntry{Introduction to ordination in vegan}
-\documentclass[a4paper,10pt]{amsart}
-\usepackage{ucs}
-\usepackage[utf8x]{inputenc}
-\usepackage[T1]{fontenc}
-\usepackage{graphicx}
+\documentclass[article,10pt,nojss]{jss}
+%\usepackage{ucs}
+%\usepackage[utf8x]{inputenc}
+%\usepackage[T1]{fontenc}
\usepackage{sidecap}
-\setlength{\captionindent}{0pt}
-\usepackage{url}
+\usepackage{amsmath}
\renewcommand{\floatpagefraction}{0.8}
-\title{Vegan: an introduction to ordination}
+\title{Vegan: an introduction to ordination}
+
\author{Jari Oksanen}
-\date{$ $Id$ $
+\Abstract{ }
+
+\Keywords{ordination, correspondence analysis, non-metric
+ multidimensional scaling, CCA, RDA, NMDS, fitted environmental
+ vector, fitted environmental surface, permutation tests}
+
+%% misuse of the address field for revision data
+\Address{$ $Id$ $
processed with vegan
\Sexpr{packageDescription("vegan", field="Version")}
in \Sexpr{R.version.string} on \today}
+\Footername{This version}
+
+%% need no \usepackage{Sweave}
\begin{document}
\setkeys{Gin}{width=0.55\linewidth}
@@ -28,35 +37,32 @@
options(width=72)
figset <- function() par(mar=c(4,4,1,1)+.1)
options(SweaveHooks = list(fig = figset))
+options("prompt" = "R> ", "continue" = "+ ")
@
-\maketitle
-\tableofcontents
-
-
-\noindent \texttt{Vegan} is a package for community ecologists. This
+\noindent \pkg{vegan} is a package for community ecologists. This
document explains how the commonly used ordination methods can be
-done in \texttt{vegan}. The document only is a very basic
[TRUNCATED]
To get the complete diff run:
svnlook diff /svnroot/vegan -r 1703
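For orientation, the preamble pattern that the added lines apply to each vignette can be collected in one place. The following is only a minimal sketch assembled from the `+` lines of the diff: the title, author, abstract, keyword and address text are placeholders rather than content from any vignette, and it assumes that jss.cls is on the TeX search path (the log says it ships with R).

```latex
% Sketch of the jss-based preamble pattern introduced in r1703.
% All text arguments below are placeholders, not vignette content.
\documentclass[article,a4paper,10pt,nojss]{jss}
\usepackage{amsmath}
\usepackage{sidecap}

\author{A. N. Author}
\title{Placeholder vignette title}
\Abstract{Placeholder abstract text.}
\Keywords{placeholder keyword, another keyword}
% The diff reuses the address field for revision data.
\Address{Placeholder revision information}
\Footername{This version}

\begin{document}
% No \tableofcontents: the diff removes it from all three vignettes.
% The diff also removes \maketitle from diversity-vegan.Rnw and
% intro-vegan.Rnw, presumably because jss.cls typesets the title
% block itself (an assumption; the log does not say so).

The \pkg{vegan} function \code{rda} is written in \proglang{R}.

\end{document}
```

The semantic macros \pkg, \code and \proglang in the body take the place of the \texttt, \textsf and \textbf markup that the diff removes throughout the vignettes.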