[Phylobase-devl] phylo4: tabulate() requires nbins argument
Ben Bolker
bolker at ufl.edu
Wed May 21 18:02:41 CEST 2008
OK.
It's easy enough for me to make these changes, *but* I think we
need to step back for a minute here.
We inherit much of our code and conventions from
Emmanuel Paradis's "phylo" objects.
http://ape.mpl.ird.fr/misc/FormatTreeR_4Dec2006.pdf
is very informative: in particular, it says that
for well-formed phylo objects the highest-numbered nodes are always
internal nodes, and so always appear in the first column of the edge
matrix (see below) -- so your problem wouldn't arise.
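
To make that concrete, here is a minimal sketch (not phylobase code). The matrix `wf` is a made-up three-tip tree that follows the phylo numbering, and `edge` is the example from Aaron's message quoted below. For the well-formed matrix, tabulate() already returns one bin per node; for Aaron's matrix it stops short unless nbins is given.

```r
# Hypothetical well-formed edge matrix: tips 1..3, internal nodes 4..5 (4 is the root)
wf <- matrix(c(4, 5,
               4, 3,
               5, 1,
               5, 2), ncol = 2, byrow = TRUE)
length(tabulate(wf[, 1])) == max(wf)   # TRUE: the highest-numbered node is internal

# Aaron's example: the root is node 1 and the highest-numbered nodes (8, 9) are tips
edge <- matrix(c(1, 3,  1, 2,  3, 4,  3, 7,
                 4, 5,  4, 6,  7, 8,  7, 9), ncol = 2, byrow = TRUE)
length(tabulate(edge[, 1]))                    # 7, although there are 9 nodes
deg <- tabulate(edge[, 1], nbins = max(edge))  # one bin per node
sum(deg == 0)                                  # 5 tips: nodes 2, 5, 6, 8, 9
```

Using nbins = max(edge) is the correction Aaron suggests in his follow-up; with it, counting zero entries finds every tip regardless of how the nodes are numbered.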
Now, I will admit that we do not currently prevent this; we should
(1) decide whether we want to accept all of these rules for phylo4
objects as well (I would like to eliminate rule b below, to
allow "singleton" nodes); (2) add these rules to phylo4 object
checking so that violations can't happen.
Thoughts, anyone?
Ben
===========================
Definition of the Class "phylo"
An object of class "phylo" is a list with, at least, the following
mandatory elements:
1. A numeric matrix named edge with two columns and as many rows as
there are branches in the tree;
2. A character vector of length n named tip.label with the labels of the
tips;
3. A numeric value named Nnode giving the number of (internal) nodes;
4. An attribute class equal to "phylo".
In the matrix edge, each branch is coded by the nodes it connects: tips
are coded 1, . . . , n, and internal nodes are coded n + 1, . . . , n +
m (n + 1 is the root). Both series are numbered with no gaps. The matrix
edge has the following properties:
a· The first column has only values > n (thus, values <= n appear only in
the second column).
b· All nodes appear in the first column at least twice.
c· The number of occurrences of a node in the first column is related to
the nature of the node: twice if it is dichotomous (i.e., of degree 3),
three times if it is trichotomous (degree 4), and so on.
d· All elements, except the root n + 1, appear once in the second column
(only if the tree has no reticulation). This representation is used for
rooted and unrooted trees. For the latter, the position of the root is
arbitrary.
[I added labels a-d]
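
As a rough sketch of what checking these rules could look like (a hypothetical helper, not part of ape or phylobase; the name `check_phylo_edge` and the example matrices are made up for illustration), rules a, b and d can each be tested with tabulate(), provided nbins is set to the total number of nodes. Rule c describes how to read the counts rather than imposing a constraint, so it is not checked.

```r
# Hypothetical helper, not part of ape or phylobase.
# edge: two-column edge matrix; n: number of tips.
check_phylo_edge <- function(edge, n) {
  nn   <- max(edge)                         # total number of nodes
  anc  <- tabulate(edge[, 1], nbins = nn)   # times each node appears as an ancestor
  desc <- tabulate(edge[, 2], nbins = nn)   # times each node appears as a descendant
  internal <- seq_len(nn) > n
  c(a = all(edge[, 1] > n),                 # rule a: only internal nodes in column 1
    b = all(anc[internal] >= 2),            # rule b: no singleton internal nodes
    d = all(desc[-(n + 1)] == 1) &&         # rule d: every non-root node has one parent
        desc[n + 1] == 0)                   #         and the root (n + 1) has none
}

# Made-up well-formed tree: tips 1..3, internal nodes 4..5
wf <- matrix(c(4, 5,  4, 3,  5, 1,  5, 2), ncol = 2, byrow = TRUE)
check_phylo_edge(wf, n = 3)     # a, b and d all TRUE

# The edge matrix from Aaron's message (5 tips, root numbered 1)
edge <- matrix(c(1, 3,  1, 2,  3, 4,  3, 7,
                 4, 5,  4, 6,  7, 8,  7, 9), ncol = 2, byrow = TRUE)
check_phylo_edge(edge, n = 5)   # a, b and d all FALSE
```

Dropping rule b to allow singleton nodes would only mean removing the `b` element from such a check.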
Aaron Mackey wrote:
| err, wait, sorry. it's not nrows() we want, but max(edge) ...
|
| -Aaron
|
| On Wed, May 21, 2008 at 11:35 AM, Ben Bolker <bolker at zoology.ufl.edu> wrote:
|
|>
|>
|> Aaron Mackey wrote:
|>
|>> I don't have functional SVN access at the moment, otherwise I'd do this
|>> myself. But essentially, calls to "tabulate()" need to define the number
|>> of bins explicitly, otherwise problems occur. For example:
|>>
|>> edge
|>> [1,] 1 3
|>> [2,] 1 2
|>> [3,] 3 4
|>> [4,] 3 7
|>> [5,] 4 5
|>> [6,] 4 6
|>> [7,] 7 8
|>> [8,] 7 9
|>>
|>> > tabulate(edge[,1])
|>> [1] 2 0 2 2 0 0 2
|>>
|>> > tabulate(edge[,1], nbins=dim(edge)[1])
|>> [1] 2 0 2 2 0 0 2 0
|>>
|>> -Aaron
|>>
|>>
|> I grepped for "tabulate". Are you recommending that
|> we change all of these usages as above?
|> (Is it worth defining a "tabedge" function
|>
|> tabedge <- function(object,i) {
|>   tabulate(edges(object)[,i], nbins=nrow(edges(object)))
|> }
|>
|> to replace most of these, or is that just too complicated?
|>
|> Ben
|>
|>
|> 1 checkdata.R: nAncest <- tabulate(edges(object)[, 2])
|> 2 class-phylo4.R: ntips <- sum(tabulate(edge[, 1]) == 0)
|> 3 class-phylo4.R: nnodes <- sum(tabulate(edge[, 1]) > 0)
|> 4 methods-phylo4.R: tabulate(edges(x)[, 1])[nTips(x) + 1] <= 2
|> 5 methods-phylo4.R: temp <- tabulate(E[,1])
|> 6 treestruc.R: degree <- tabulate(edges(object)[, 1])
|> 7 treestruc.R: degree <- tabulate(edges(object)[, 1])
|> 8 treestruc.R:# isTips <- (tabulate(x at edge[,1]) == 0)
|> 9 treestruc.R:# res <- (tabulate(x at edge[,1]) > 2)