[Phylobase-devl] labeling order

Ben Bolker bolker at ufl.edu
Sat Dec 27 18:20:46 CET 2008


 Hmmm.

tibo wrote:

> Here are some opinions, in case there is still time to express them (after
> the battle). I recognize that most of them amount to encouraging us to
> change data formats as little as possible -- basically because I now have
> a working package based on our current data representation. Also, from
> what I and some of my colleagues working with phylobase have experienced
> so far, it works pretty well and in a sensible way.

  I think the attempt is to make things consistent, which they weren't
entirely before.  I agree that changing things as little as possible is
a good idea!

>>   Hmmm.
>>
>>   For the record, here's Steve's statement:
>>
>> SWK - This is crucial and we should decide soon, needs to be sorted
>> out for 1.5. I think that many of the problems we're having with
>> labels and reordering are due to the fact that until now we treated
>> nodes and edges as interchangeable, i.e. we had node labels in edge
>> matrix order, but these labels should really be associated with
>> nodes, not with edges. 
> I could not agree more.
>> This assumption caused things to break once edges
>>  and nodes were not equivalent (now that root edge is in the edge matrix
>> and we allow edge matrix reordering, or for unrooted trees). I
>> think we need to be very clear about whether methods are actually
>> operating on nodes or edges.
>> I suggest that edge, edge.labels and edge.lengths (branch lengths)
>> are in 'edge' order. 
> I can hardly see how it would make sense otherwise. All information
> provided for a given item should be sorted according to this item. Tip
> labels should be in tip order, internal node labels sorted by
> node number, etc.

  Here's where it gets tricky.  Of course it's sensible for edge
lengths and labels to be in edge matrix order ... for the others
(tip labels, node labels), what do you mean by "tip order", "node numbers"?

>> Everything else (node labels, tip labels) should
>> be in node id order. nodeId can translate between these two orders.
>> Reorder can act on the edge* only since the underlying node ids
>> will not change.
>>
>> Francois: It's definitely a crucial issue. Perhaps we could track
>> node.labels and tip.labels by using named vectors, the names of the
>> vector would be the nodeId.
>>   
> I may be missing something here, but isn't this what we do when using getnodes?

  I think we don't need more identifiers than node numbers ...

>> Marguerite:
>> This one is very important, and I think it's a very bad idea to unlink
>> the edges and nodes. Edges and nodes are intimately linked. In my
>> mind, the edge is simply the branch below the node. So to have edges
>> in one order and nodes in another order makes no sense to me at all.
>> Why don't we simply give node ID's in "edge" order as you are using
>> it? otherwise, there is HUGE potential for confusion. And we would
>> need yet another index that indicates a mapping of the node ID to the
>> edge matrix.
>>   
> Again, I completely agree. Edges are uniquely identified by their
> descending node, and this is what we have used from the beginning.
> Moreover, this is what is used in ape, and I think we should diverge
> from it only when it is mandatory (e.g. plotting trees with singletons if
> these make sense). Most phylobase users are and will be primarily ape users.

  We're not diverging from this.
  We're saying that we will keep data and the lists of node labels
(tips and internal nodes) in order of node numbers, and not rearrange
them every time we reorder the edge matrix.
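
  To make the node-order idea concrete, here is a minimal sketch. It is
written in Python purely for illustration and is not phylobase code; the
toy tree, the variable names and both helper functions are hypothetical.
Per-edge data travel with the edge matrix, while labels stay keyed by
node number and are looked up through each edge's descendant node.

```python
# Hypothetical toy tree ((t1,t2),t3): tips are 1..3, root is 4, internal node is 5.
# Each edge is (ancestor, descendant); this is NOT phylobase's actual storage.
edges = [(4, 5), (5, 1), (5, 2), (4, 3)]
edge_lengths = [0.5, 1.0, 1.2, 2.0]  # kept in edge-matrix order

# Labels are keyed by node number, so they do not depend on edge order.
node_labels = {1: "t1", 2: "t2", 3: "t3", 4: "root", 5: "n5"}


def reorder_edges(edges, edge_lengths, order):
    """Permute the edge matrix and the per-edge data together."""
    return [edges[i] for i in order], [edge_lengths[i] for i in order]


def edge_descendant_labels(edges, node_labels):
    """For each edge, look up the label of its descendant node."""
    return [node_labels[desc] for _, desc in edges]


# Reordering touches only the edge matrix and edge lengths;
# node_labels is left exactly as it was.
new_edges, new_lengths = reorder_edges(edges, edge_lengths, [3, 1, 2, 0])
```

  In this sketch the descendant column of the edge matrix plays the role
that nodeId plays in Steve's proposal: it is the only thing needed to
translate between edge order and node order.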

>> Instead, why don't we just decide on a standard ordering for phylobase,
>> number the node ID's in this way, and then allow the edge matrix and
>> nodeID (and all data vectors) to be reordered as needed for whatever
>> functions.  Using the node ID, we can easily  put everything back to
>> the "default" phylobase order, BUT ONLY IF all objects (edge matrix,
>> branch lengths, labels, etc.) are in the SAME order. Don't "break"
>> the integrity of the object just for programming convenience. There is
>> just too much danger for confusion. I, for one, would stop using
>> phylobase, because it's just too hard to remember the peculiarities of
>> the way the object is constructed. Every time I wanted to do something,
>> I'd have to relearn the rules.
>>   
> Same for me.
> 

  Hmm.
  I've been working to try to make everything consistent in node order
(as Steve suggested).  Thibaut/Marguerite, what do you suggest for the
case of unrooted trees?  Thibaut, how often do you match up edges with
data and labels?
   I've done a bunch of stuff, and I'd like to commit it, because it's
all reasonably consistent now, but I'd like to hear some more
conversation -- I'm willing to work back through while it's fresh in
my mind and do everything the opposite way (keeping everything
in edge-matrix order all the time), provided we know how to handle
unrooted trees (and are willing to live with not being able to handle
reticulations).
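
  A rough sketch of why those two cases are awkward for a pure
edge-matrix ordering, again in Python for illustration only. The three
toy edge lists, the use of None for the missing ancestor of the root
edge, and the function names are all assumptions made for this example,
not phylobase's representation.

```python
from collections import Counter


def edges_per_descendant(edges):
    """Count how many edges end at each node."""
    return Counter(desc for _, desc in edges)


def nodes_without_edge(edges):
    """Nodes that never appear as a descendant, i.e. have no edge 'below' them."""
    nodes = {n for edge in edges for n in edge if n is not None}
    return sorted(nodes - set(edges_per_descendant(edges)))


# Rooted tree with a root edge (ancestor None): one edge per node.
rooted = [(None, 4), (4, 5), (5, 1), (5, 2), (4, 3)]

# Unrooted three-tip tree: the central node 4 is nobody's descendant.
unrooted = [(4, 1), (4, 2), (4, 3)]

# Hypothetical reticulation: node 6 has two incoming edges (from 4 and 5).
reticulate = [(None, 4), (4, 5), (4, 6), (5, 6), (5, 1), (6, 2), (6, 3)]
```

  Under these assumptions, the rooted tree gives a one-to-one match
between edges and nodes, the unrooted tree leaves node 4 with no edge to
carry its label or data, and the reticulate tree gives node 6 two edges,
so the descendant node no longer identifies a single edge.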

  More discussion please?

 Ben
