[Phylobase-devl] accessor and replace method consistency

Peter Cowan pdc at berkeley.edu
Sat Sep 5 03:21:02 CEST 2009


On Sep 4, 2009, at 5:37 PM, Jim Regetz wrote:

> Hi all,
>
> I promise this will be my last biggie proposal before the release.
>
> We've found and fixed a number of bugs resulting from misaligning the
> elements of various slots (labels and nodes, edges and lengths, nodes
> and data, etc). I've noticed a few new ones that I haven't reported  
> yet
> (in plot, prune, and tdata). Partly these occur because we sometimes
> assume too much about slot data when accessing them directly via the @
> operator -- assumptions that tend to be true of the test data we  
> usually
> use, but aren't in fact enforced. And partly this is because the
> accessor methods themselves are not consistent across slots (and have
> changed over time).

Yes, thank you Jim and Francois for identifying and fixing many of  
these mismatches/bugs.  I think the agreed upon strategy is to promote  
the use of the accessor methods for getting the elements.  I for one  
am quite guilty of not following that advise.

[snip]

> * The following should always return the vectors ordered by node ID,  
> in
> ascending order:
>
> nodeId
> nodeType (already does)
> labels
> tipLabels
> nodeLabels
>
> This means the ordering will *always* correspond to 1:nTips for tips,
> (nTips+1):(nTips+nNodes) for internal nodes, and 1:(nTips+nNodes) for
> the 'all' case. In the case of missing labels, the accessor should  
> still
> return a complete vector of NAs.
>
> * Similarly, tdata (whether tip, internal, or all) should always  
> return
> a data frame with rows ordered by node ID, in ascending order.
>
> * edge should always returns the edge matrix as-is, because the actual
> row order can matter for traversal methods.
>
> * edgeLength and edgeLabel should always return the vectors in an  
> order
> that matches the rows of the current edge matrix. As with node labels,
> in the case of missing edge lengths or labels, the accessor should  
> still
> return a complete vector of NAs.
>
> * For Replace methods, when names are not used, the replacement  
> ordering
> should corresponding directly to accessor ordering (and not the  
> existing
> internal ordering). For example, even if the tip.label slot is:
>
> x at tip.label
>   2    3    1
> "t2" "t3" "t1"
>
> then tipLabels(x) <- c("tA", "tB", "tC") should still produce this:
>
> x at tip.label
>   1    2    3
> "tA" "tB" "tC"
>
> It already does, happily, but not all of the setters are like this.
>
> Incidentally, if we modify nodeId as I proposed, then the pretty print
> method would show the object with tips first, then the internal  
> nodes. I
> realize that's different from what we have now, and maybe I missed an
> earlier discussion about this, but I would be totally fine with  
> that. Of
> course, if folks always want internal nodes first, then tips, that's
> fine, and we can tweak the print method accordingly. Either way  
> would be
> better than now IMHO, as the printed order can change when the edge
> order changes (even with the pretty method).
>
> Whew. I'm happy to debate these (and can certainly change my mind on
> details), but in any case I think it would be a boon for developers  
> and
> users alike to have more consistency on these fronts.

+1 from me.  Anything that is consistent *and* documented would be a  
nice improvement in my book.

This sounds like a pretty significant change.  How long do you think  
it would take to pull off?  I can probably find some time to help w/  
this if needed, especially after Wed.

Peter

>
> Jim
>
> _______________________________________________
> Phylobase-devl mailing list
> Phylobase-devl at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/phylobase-devl



More information about the Phylobase-devl mailing list