Thanks, this is useful info (and should probably be included in the docs for the phylo4() constructor). It turns out I'll have to conform to this spec to get many other things working as well, so, as you say, the requirement for "nbins" in tabulate() goes away with "proper" edge matrices.<br>
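<br>
To illustrate why "nbins" stops mattering, here is a minimal sketch on plain matrices. The helper tab_edge() is hypothetical (it is not part of phylobase), and the renumbered matrix is just my example tree relabelled by hand to follow the ape convention.<br>

```r
# The example from this thread: node numbers are arbitrary,
# and the largest labels (8, 9) are tips.
edge_raw <- matrix(c(1, 3,
                     1, 2,
                     3, 4,
                     3, 7,
                     4, 5,
                     4, 6,
                     7, 8,
                     7, 9), ncol = 2, byrow = TRUE)

# Hypothetical helper (not part of phylobase): count occurrences of each
# node in column i, with one bin per node label up to max(edge).
tab_edge <- function(edge, i) {
  tabulate(edge[, i], nbins = max(edge))
}

length(tabulate(edge_raw[, 1]))  # 7: no bins for nodes 8 and 9
length(tab_edge(edge_raw, 1))    # 9: one bin per node

# Same topology renumbered ape-style: tips 1..5, root 6, internal nodes 7..9.
edge_ape <- matrix(c(6, 7,
                     6, 1,
                     7, 8,
                     7, 9,
                     8, 2,
                     8, 3,
                     9, 4,
                     9, 5), ncol = 2, byrow = TRUE)

# The largest label is now an internal node, so it appears in column 1
# and the default number of bins already covers every node.
identical(tabulate(edge_ape[, 1]), tab_edge(edge_ape, 1))  # TRUE
```

With the ape numbering the explicit "nbins" is redundant for column 1; it would still be needed for arbitrary numbering such as edge_raw.<br>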
<br>Thanks again,<br><br>-Aaron<br><br><div class="gmail_quote">On Wed, May 21, 2008 at 12:02 PM, Ben Bolker <<a href="mailto:bolker@ufl.edu">bolker@ufl.edu</a>> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<br>
OK.<br>
<br>
It's easy enough for me to make these changes, *but* I think we<br>
need to step back for a minute here.<br>
<br>
We inherit much of our code and conventions from<br>
Emmanuel Paradis's "phylo" objects.<br>
<br>
<a href="http://ape.mpl.ird.fr/misc/FormatTreeR_4Dec2006.pdf" target="_blank">http://ape.mpl.ird.fr/misc/FormatTreeR_4Dec2006.pdf</a><br>
<br>
is very informative: in particular, it says that<br>
for well-formed phylo objects the maximum node numbers are always<br>
internal nodes and always have degree > 0 (see below) -- so your problem<br>
wouldn't arise.<br>
<br>
Now, I will admit that we do not currently prevent this; we should<br>
(1) decide whether we want to accept all of these rules for phylo4<br>
objects as well (I would like to eliminate rule b below, to<br>
allow "singleton" nodes); (2) add these rules to phylo4 object<br>
checking so that malformed edge matrices are rejected.<br>
<br>
Thoughts, anyone?<br>
<br>
Ben<br>
<br>
===========================<br>
Definition of the Class "phylo"<br>
<br>
An object of class "phylo" is a list with, at least, the following<br>
mandatory elements:<br>
<br>
1. A numeric matrix named edge with two columns and as many rows as<br>
there are branches in the tree;<br>
2. A character vector of length n named tip.label with the labels of the<br>
tips;<br>
3. A numeric value named Nnode giving the number of (internal) nodes;<br>
4. An attribute class equal to "phylo".<br>
<br>
In the matrix edge, each branch is coded by the nodes it connects: tips<br>
are coded 1, . . . , n, and internal nodes are coded n + 1, . . . , n +<br>
m (n + 1 is the root). Both series are numbered with no gaps. The matrix<br>
edge has the following properties:<br>
<br>
a· The first column has only values > n (thus, values ≤ n appear only in<br>
the second column).<br>
b· All nodes appear in the first column at least twice.<br>
c· The number of occurrences of a node in the first column is related to<br>
the nature of the node: twice if it is dichotomous (i.e., of degree 3),<br>
three times if it is trichotomous (degree 4), and so on.<br>
d· All elements, except the root n + 1, appear once in the second column<br>
(only if the tree has no reticulation). This representation is used for<br>
rooted and unrooted trees. For the latter, the position of the root is<br>
arbitrary.<br>
<br>
[I added labels a-d]<div><div></div><div class="Wj3C7c"><br>
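<br>
As a rough sketch of what such checking might look like, the function below tests rules a, b and d on a plain edge matrix. The name check_phylo_edge() and its arguments are hypothetical (this is not existing ape or phylobase code), it assumes node numbers have no gaps, and it skips rule c, which describes node degree rather than imposing a constraint.<br>

```r
# Hypothetical checker (not part of ape or phylobase) for rules a, b and d.
# edge: two-column matrix of node numbers; n: number of tips.
check_phylo_edge <- function(edge, n) {
  n_all <- max(edge)                            # n + m if numbering has no gaps
  anc  <- tabulate(edge[, 1], nbins = n_all)    # times each node is a parent
  desc <- tabulate(edge[, 2], nbins = n_all)    # times each node is a child
  internal <- seq_len(n_all) > n
  root <- n + 1
  c(
    a = all(edge[, 1] > n),                     # only internal nodes in column 1
    b = all(anc[internal] >= 2),                # internal nodes have >= 2 children
    d = all(desc[-root] == 1) && desc[root] == 0  # one parent each, none for root
  )
}

# Well-formed tree: tips 1..5, root 6, internal nodes 7..9.
edge_ape <- matrix(c(6, 7,  6, 1,  7, 8,  7, 9,
                     8, 2,  8, 3,  9, 4,  9, 5), ncol = 2, byrow = TRUE)

# Tree with a singleton node: root 3 has a single child 4; tips are 1 and 2.
edge_single <- matrix(c(3, 4,  4, 1,  4, 2), ncol = 2, byrow = TRUE)

check_phylo_edge(edge_ape, n = 5)     # a, b and d all TRUE
check_phylo_edge(edge_single, n = 2)  # only b is FALSE
```

Dropping rule b to allow singleton nodes would amount to removing the "b" entry; rules a and d still hold for the singleton tree.<br>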
<br>
<br>
<br>
<br>
Aaron Mackey wrote:<br>
| err, wait, sorry. it's not nrows() we want, but max(edge) ...<br>
|<br>
| -Aaron<br>
|<br>
| On Wed, May 21, 2008 at 11:35 AM, Ben Bolker <<a href="mailto:bolker@zoology.ufl.edu" target="_blank">bolker@zoology.ufl.edu</a>> wrote:<br>
|<br>
|><br>
|><br>
|> Aaron Mackey wrote:<br>
|><br>
|>> I don't have functional SVN access at the moment, otherwise I'd do this<br>
|>> myself. But essentially, calls to "tabulate()" need to define the number<br>
|>> of bins explicitly, otherwise problems occur. For example:<br>
|>><br>
|>> edge<br>
|>> [1,] 1 3<br>
|>> [2,] 1 2<br>
|>> [3,] 3 4<br>
|>> [4,] 3 7<br>
|>> [5,] 4 5<br>
|>> [6,] 4 6<br>
|>> [7,] 7 8<br>
|>> [8,] 7 9<br>
|>><br>
|>> > tabulate(edge[,1])<br>
|>> [1] 2 0 2 2 0 0 2<br>
|>><br>
|>> > tabulate(edge[,1], nbins=dim(edge)[1])<br>
|>> [1] 2 0 2 2 0 0 2 0<br>
|>><br>
|>> -Aaron<br>
|>><br>
|>><br>
|> I grepped for "tabulate". Are you recommending that<br>
|> we change all of these usages as above?<br>
|> (Is it worth defining a "tabedge" function<br>
|><br>
|> tabedge <- function(object,i) {<br>
|> tabulate(edges(object)[,i], nbins=nrow(edges(object)))<br>
|> }<br>
|><br>
|> to replace most of these, or is that just too complicated?)<br>
|><br>
|> Ben<br>
|><br>
|><br>
|> 1 checkdata.R: nAncest <- tabulate(edges(object)[, 2])<br>
|> 2 class-phylo4.R: ntips <- sum(tabulate(edge[, 1]) == 0)<br>
|> 3 class-phylo4.R: nnodes <- sum(tabulate(edge[, 1]) > 0)<br>
|> 4 methods-phylo4.R: tabulate(edges(x)[, 1])[nTips(x) + 1] <= 2<br>
|> 5 methods-phylo4.R: temp <- tabulate(E[,1])<br>
|> 6 treestruc.R: degree <- tabulate(edges(object)[, 1])<br>
|> 7 treestruc.R: degree <- tabulate(edges(object)[, 1])<br>
|> 8 treestruc.R:# isTips <- (tabulate(x@edge[,1]) == 0)<br>
|> 9 treestruc.R:# res <- (tabulate(x@edge[,1]) > 2)<br>
|><br>
|><br>
|><br>
|><br>
|><br>
|<br>
<br></div></div>
</blockquote></div><br>