[Mattice-commits] r113 - pkg/misc

Tue Dec 30 04:06:32 CET 2008

Author: andrew_hipp
Date: 2008-12-30 04:06:31 +0100 (Tue, 30 Dec 2008)
New Revision: 113

Modified:
   pkg/misc/maticce.Rnw
Log:
more on vignette

Modified: pkg/misc/maticce.Rnw
===================================================================

--- pkg/misc/maticce.Rnw	2008-12-29 21:29:04 UTC (rev 112)
+++ pkg/misc/maticce.Rnw	2008-12-30 03:06:31 UTC (rev 113)
@@ -46,7 +46,7 @@
 
 \section{Extracting information from trees}
 
-Three functions to extract information from an \code{ouchtree} object:
+Three functions are available to extract information from an \code{ouchtree} object:
 
 \begin{itemize}
   \item \code{isMonophyletic}: returns a \code{T} or \code{F} depending on whether the taxa identified are monophyletic on the tree provided
@@ -74,349 +74,30 @@
 
 \section{Painting regimes}
 
-XXXXXXXXXXXXXXXXX
+Two functions are available for painting selective regimes that may be used in the \code{hansen} function of \pkg{ouch}:
 
-<<randtree1,fig=FALSE>>=
-library(ape)
-# Make a random tree with 10 tips
-rand_tree <- rcoal(10)
-plot(rand_tree)
-@ 
+\begin{itemize}
+  \item \code{paintBranches}: returns the single regime for changes occuring at all specified nodes
+  \item \code{regimeVectors}: returns all possible regimes for specified nodes, up to a maximum of \code{maxNodes} + 1 optima
+  \item \code{regimeMaker}: returns regimes defined by a matrix, with each row specifying which nodes the optimz change at
+\end{itemize}
 
-However, typing \code{?plot} still takes us to the default \code{plot} help.  We have to type \code{plot.phylo} to find what we are looking for.  This is because \code{S3} generics are simply functions with a dot and the class name added.  
+The \code{paintBranches} function is typically called from within \code{regimeVectors}, but it can be called separately. Nodes can be designated by number or taxa; the function assumes the latter only if it receives a list to evaluate instead of a vector.
 
-The \code{S4} generic system is too complicated to describe here, but doesn't include the same dot notation.  As a result \code{?plot.phylo4} doesn't work, \code{R} does, however, find the right plotting function.
+<<paintingOne,fig=FALSE>>=
+library(ouch)
+ou2 <- paintBranches(list(nodes[[2]]), otree)
+@
 
-<<convtree,fig=FALSE>>=
-library(phylobase)
-# convert rand_tree to a phylo4 object
-rand_p4_tree <- as(rand_tree, "phylo4")
-plot(rand_p4_tree)
-@ 
+The regime can be used directly in a call to \code{hansen} or the \code{plot} method for an \code{ouchtree} object.
 
-All fine and good, but how to we find out about all the great features of the \code{phylobase} plotting function?  \code{R} has two nifty ways to find it, the first is to simply put a question mark in front of the whole call:
-
-\begin{verbatim}
-	> ?plot(rand_p4_tree)
-\end{verbatim}
-
-\code{R} looks at the class of the \code{rand\_p4\_tree} object and takes us to the correct help file (note: this only works with \code{S4} objects).  The second ways is handy if you already know the class of your object, or want to compare to generics for different classes:
-
-\begin{verbatim}
-	> method?plot("phylo4")
-\end{verbatim}
-
-More information about how \code{S4} documentation works 
-can be found in the methods package, by running the following command.
-
-<<doc,eval=FALSE>>=
-help('Documentation', package = "methods") 
-@ 
-
-\section{Trees without data}
-
-You can start with a tree --- an object of
-class \code{phylo} from the \code{ape} package
-(e.g., read in using the \code{read.tree()} or \code{read.nexus()}
-functions), and convert it to a \code{phylo4} object.
-
-For example, load the raw \emph{Geospiza} data:
-<<geodata>>=
-data(geospiza_raw)
-names(geospiza_raw)
-@ 
-
-Convert the \code{S3} tree to a \code{S4 phylo4} object using the \code{as()} function:
-<<convgeodata>>=
-library(phylobase)
-g1 <- as(geospiza_raw$tree,"phylo4")
-g1
-@ 
-
-Note that the nodes and edges are given default names if the tree contains no node or edge names.
-
-The \code{summary} method gives a little extra information, including information on branch lengths:
-<<sumgeodata>>=
-summary(g1)
-@ 
-
-Print tip labels:
-<<tiplabelgeodata>>=
-labels(g1)
-@ 
-
-Print internal node labels (R automatically assigns values):
-<<nodelabelgeodata>>=
-nodeLabels(g1)
-@ 
-
-Print edge labels (also automatically assigned):
-<<edgelabelgeodata>>=
-edgeLabels(g1)
-@ 
-
-Is it rooted?
-<<rootedgeodata>>=
-isRooted(g1)
+<<plotou2, fig=TRUE>>=
+plot(otree, regimes = ou2)
 @
 
-Which node is the root?
-<<rootnodegeodata>>=
-rootNode(g1)
-@ 
+\section{Batch analyses}
 
-Does it have any polytomies?
-<<polygeodata>>=
-hasPoly(g1)
-@ 
 
-Does it have branch lengths?
-<<hasbrlengeodata>>=
-hasEdgeLength(g1)
-@ 
 
-You can modify labels and other aspects
-of the tree --- for example,
-<<modlabelsgeodata>>=
-labels(g1) <- tolower(labels(g1))
-@ 
 
-\section{Trees with data}
-
-The \code{phylo4d} class matches trees with data.
-(\textbf{fixme: need to be able to use ioNCL!})
-or combine it with a data frame to make a \code{phylo4d} (tree-with-data)
-object.
-
-Now we'll take the \emph{Geospiza} data from \verb+geospiza_raw$data+
-and merge it with the tree.  However, since \emph{G. olivacea} is included
-in the tree but not in the data set, we will initially run into some trouble:
-
-<<geomergedata,eval=FALSE>>=
-g2 <- phylo4d(g1,geospiza_raw$data)
-@ 
-
-gives
-<<geomergeerr1,echo=FALSE>>=
-err1 <- try(g2 <- phylo4d(g1,geospiza_raw$data),silent=TRUE)
-cat(as.character(err1))
-@ 
-
-We have two problems --- the first is that we forgot to lowercase
-the labels on the data to match the tip labels:
-
-<<geomergenames>>=
-gdata <- geospiza_raw$data
-row.names(gdata) <- tolower(row.names(gdata))
-@ 
-
-To deal with the second problem
-(missing data for \emph{G. olivacea}), we have a few choices.
-The easiest is to use \code{missing.tip.data="OK"}
-to allow R to create the new object:
-<<geomerge2>>=
-g2 <- phylo4d(g1,gdata,missing.tip.data="OK")
-@ 
-(setting \code{missing.tip.data} to \code{"warn"}
-would create the new object but print a warning).
-
-Another way to deal with this would be to 
-use \code{prune()} to drop
-the offending tip from the tree first:
-<<geomerge3,results=hide>>=
-g1B <- prune(g1,"olivacea")
-phylo4d(g1B,gdata)
-@ 
-
-You can summarize the new object:
-<<geomergesum>>=
-summary(g2)
-@ 
-
-Or use \code{tdata()} to extract the data (i.e., \code{tdata(g2)}). By default, \code{tdata()} will retrieve tip data, but you can also get internal node data only (\code{tdata(tree,"node")}) or --- if the tip and node data have the same format --- all the data combined (\code{tdata(tree,"allnode")}).
-
-Plotting calls \code{plot.phylog} from the \code{ade4} package.
-
-If you want to plot the data (e.g. for checking the input), \code{plot(tdata(g2))} will create the default plot for the data --- in this case, since it is a data frame [\textbf{this may change in future versions but should remain transparent}] this will be a \code{pairs} plot of the data.
-
-\section{Subsetting}
-
-The \code{subset} command offers a variety of ways of extracting portions of a \code{phylo4} or \code{phylo4d} tree, keeping any tip/node data consistent.
-
-\begin{description}
-\item[tips.include]{give a vector of tips (names or numbers) to retain}
-\item[tips.exclude]{give a vector of tips (names or numbers) to drop}
-\item[mrca]{give a vector of node or tip names or numbers; extract the clade containing these taxa}
-\item[node.subtree]{give a node (name or number); extract the subtree starting from this node}
-\end{description}
-
-Different ways to extract the \emph{fuliginosa}-\emph{scandens}
-clade:
-<<geoextract,results=hide>>=
-subset(g2,tips.include=c("fuliginosa","fortis","magnirostris",
-            "conirostris","scandens"))
-subset(g2,node.subtree="N07")
-subset(g2,mrca=c("scandens","fortis"))
-@ 
-
-One could drop the clade by  doing
-<<geodrop,results=hide>>=
-subset(g2,tips.exclude=c("fuliginosa","fortis","magnirostris",
-            "conirostris","scandens"))
-subset(g2,tips.exclude=names(descendants(g2,MRCA(g2,c("difficilis","fortis")))))
-@ 
-
-Another approach is to pick the subtree graphically, by plotting the tree and using \code{identify}, which returns the identify of the node you click on with the mouse.
-
-<<geoident,eval=FALSE>>=
-plot(g1)
-n1 <- identify(g1)
-subset(g2,node.subtree=n1)
-@ 
-
-\section{Tree-walking}
-
-\code{getnodes},
-\code{children}, \code{parent},
-\code{descendants}, \code{ancestors},
-\code{siblings},
-\code{MRCA} \ldots
-generally take a \code{phylo4} object, a node
-(specified by number or name) and return a named
-vector of node numbers.
-
-\section{multiPhylo classes}
-
-\section{Examples}
-
-\subsection{Constructing a Brownian motion trait simulator}
-
-This section will describe two (?) ways of constructing
-a simulator that generates trait values for extant species
-(tips) given a tree with branch lengths, assuming a model
-of Brownian motion.
-
-\subsubsection{the easy way}
-
-We can use the \code{vcv.phylo()} command from
-\code{ape} to construct the variance-covariance
-matrix of the tip traits, after which it's easy
-to use \code{mvrnorm} from the \code{MASS} package
-to generate a set of multivariate normally distributed
-values for the tips. (A benefit of this approach is
-that we can very quickly generate a very large
-number of replicates.)
-This example illustrates a common feature of
-working with \code{phylobase} --- combining tools from
-several different packages to operate on phylogenetic
-trees with data. 
-
-We start with a randomly generated tree using
-\code{rcoal()} from \code{ape} to generate the
-tree topology and branch lengths:
-<<rtree2>>=
-set.seed(1001)
-tree <- rcoal(12)
-@ 
-
-Next we generate the phylogenetic variance-covariance
-matrix (\code{ape::vcv.phylo}) and pick a single set
-of traits (\code{MASS:mvrnorm}).  Conveniently, the
-tip names of the original tree
-are inherited consistently by the variance-covariance
-matrix and the trait matrix:
-<<vcvphylo>>=
-vmat <- vcv.phylo(tree,cor=TRUE)
-library(MASS)
-trvec <- mvrnorm(1,mu=rep(0,12),Sigma=vmat)
-@ 
-
-The last step (easy) is to create the \code{phylo4d}
-object and plot it:
-<<plotvcvphylo,fig=TRUE>>=
-treed <- phylo4d(tree,tip.data=as.data.frame(trvec))
-plot(treed)
-@ 
-
-\subsubsection{The hard way?}
-
-Find root, traverse tree:
-
-
-% ========================================
-% = Table of commands, worth the effort? =
-% ========================================
-% \begin{tabular}{>{\tt}ll}
-% \hline
-% \rm Method & Description\\
-% \hline
-% tdata & Retrieve tip data\\
-% plot & plot tree with data if present\\
-% \hline
-% \end{tabular}
-
-
-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-%%% Appendices %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-
-\appendix
-\section{Definitions/slots}
-
-This section details the internal structure of the \code{phylo4}, \code{multiphylo4}, \code{phylo4d}, and \code{multiphylo4d} classes.  The basic building blocks of these classes are the \code{phylo4} object and a dataframe.  The \code{phylo4} tree format is largely similar to the one used by \code{phylo} class in the package \code{ape} \footnote{\url{http://ape.mpl.ird.fr/}}.
-
-
-\subsection{phylo4}
-Like \code{phylo}, the main components of
-the \code{phylo4} class are:
-\begin{description}
-\item[edge]{an $N \times 2$ matrix of integers,
-  where the first column \ldots}
-\item[edge.length]{numeric list of edge lengths
-(length $N$ or empty)}
-\item[Nnode]{integer, number of nodes}
-\item[tip.label]{character vector of tip labels (required)}
-\item[node.label]{character vector of node labels (maybe empty)}
-\item[root.edge]{integer defining root edge (maybe NA)}
-\end{description}
-
-We have defined basic methods for \code{phylo4}:\code{show}, \code{print} (copied from \code{print.phylo} in\code{ape}), and a variety of accessor functions (see help files). \code{summary} does not seem to be terribly useful in the context of a ``raw'' tree, because there is not much to compute: \textbf{end users?}
-
-Print method: add information about (ultrametric, scaled, polytomies (zero-length or structural))?
-
-\subsection{phylo4d}
-
-The \code{phylo4d} class extends \code{phylo4} with data.  Tip data, (internal) node data, and edge data are stored separately, but can be retrieved together or separately with \code{tdata(x,"tip")} or \code{tdata(x,"all")}.
-
-\textbf{edge data can also be included --- is this
-useful/worth keeping?}
-
-\subsection{multiphylo4}
-
-\section{Validity checking}
-
-\begin{itemize}
-\item number of rows of edge matrix ($N$) == length of edge-length vector (if $>0$)
-\item (number of tip labels)+(nNode)-1 == $N$
-\item data matrix must have row names
-\item row names must match tip labels (if not, spit out mismatches)
-\end{itemize}
- 
-Default node labels:
-
-\section{Hacks/backward compatibility}
-
-There is a way to hack the \verb+$+ operator so that it would provide backward compatibility with code that is extracting internal elements of a \code{phylo4}. The basic recipe is: 
-
-<<eval=FALSE>>=
-setMethod("$","phylo4",function(x,name) { attr(x,name)})
-@ 
-
-but this has to be hacked slightly to intercept calls to elements that might be missing.  For example, \code{ape} detects whether log-likelihood, root edges, node labels, etc. are missing by testing whether they are \code{NULL}, whereas missing items are represented in \code{phylo4} by zero-length vectors in the slots (or \code{NA} for the root edge) --- so we need code like 
-<<eval=FALSE>>=
-if(!hasNodeLabels(x)) NULL else x at node.label
-@ 
-to handle these cases.
-
-
 \end{document}