[Phylobase-devl] accessor and replace method consistency
Peter Cowan
pdc at berkeley.edu
Sat Sep 5 03:21:02 CEST 2009
On Sep 4, 2009, at 5:37 PM, Jim Regetz wrote:
> Hi all,
>
> I promise this will be my last biggie proposal before the release.
>
> We've found and fixed a number of bugs resulting from misaligning the
> elements of various slots (labels and nodes, edges and lengths, nodes
> and data, etc). I've noticed a few new ones that I haven't reported
> yet
> (in plot, prune, and tdata). Partly these occur because we sometimes
> assume too much about slot data when accessing them directly via the @
> operator -- assumptions that tend to be true of the test data we
> usually
> use, but aren't in fact enforced. And partly this is because the
> accessor methods themselves are not consistent across slots (and have
> changed over time).
Yes, thank you Jim and Francois for identifying and fixing many of
these mismatches/bugs. I think the agreed upon strategy is to promote
the use of the accessor methods for getting the elements. I for one
am quite guilty of not following that advise.
[snip]
> * The following should always return the vectors ordered by node ID,
> in
> ascending order:
>
> nodeId
> nodeType (already does)
> labels
> tipLabels
> nodeLabels
>
> This means the ordering will *always* correspond to 1:nTips for tips,
> (nTips+1):(nTips+nNodes) for internal nodes, and 1:(nTips+nNodes) for
> the 'all' case. In the case of missing labels, the accessor should
> still
> return a complete vector of NAs.
>
> * Similarly, tdata (whether tip, internal, or all) should always
> return
> a data frame with rows ordered by node ID, in ascending order.
>
> * edge should always returns the edge matrix as-is, because the actual
> row order can matter for traversal methods.
>
> * edgeLength and edgeLabel should always return the vectors in an
> order
> that matches the rows of the current edge matrix. As with node labels,
> in the case of missing edge lengths or labels, the accessor should
> still
> return a complete vector of NAs.
>
> * For Replace methods, when names are not used, the replacement
> ordering
> should corresponding directly to accessor ordering (and not the
> existing
> internal ordering). For example, even if the tip.label slot is:
>
> x at tip.label
> 2 3 1
> "t2" "t3" "t1"
>
> then tipLabels(x) <- c("tA", "tB", "tC") should still produce this:
>
> x at tip.label
> 1 2 3
> "tA" "tB" "tC"
>
> It already does, happily, but not all of the setters are like this.
>
> Incidentally, if we modify nodeId as I proposed, then the pretty print
> method would show the object with tips first, then the internal
> nodes. I
> realize that's different from what we have now, and maybe I missed an
> earlier discussion about this, but I would be totally fine with
> that. Of
> course, if folks always want internal nodes first, then tips, that's
> fine, and we can tweak the print method accordingly. Either way
> would be
> better than now IMHO, as the printed order can change when the edge
> order changes (even with the pretty method).
>
> Whew. I'm happy to debate these (and can certainly change my mind on
> details), but in any case I think it would be a boon for developers
> and
> users alike to have more consistency on these fronts.
+1 from me. Anything that is consistent *and* documented would be a
nice improvement in my book.
This sounds like a pretty significant change. How long do you think
it would take to pull off? I can probably find some time to help w/
this if needed, especially after Wed.
Peter
>
> Jim
>
> _______________________________________________
> Phylobase-devl mailing list
> Phylobase-devl at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/phylobase-devl
More information about the Phylobase-devl
mailing list