[Phylobase-devl] unification of tree data slots
Jim Regetz
regetz at nceas.ucsb.edu
Wed Sep 23 20:34:40 CEST 2009
Hi all,
In the slot-mods branch, phylo4 now has a single 'label' slot, replacing
the separate 'tip.label' and 'node.label' slots in the original class
definition. As proposed, the tipLabels, nodeLabels, and labels accessors
all return exactly the same thing as before.
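To make the idea concrete, here is a minimal base-R sketch of how three accessors can all read from one label vector keyed by node ID. It is not the phylobase code: `get_labels`, `label`, and `n_tip` are hypothetical names, and it assumes tips are numbered 1..n_tip with internal nodes following.

```r
# Hypothetical stand-in for the unified slot: one character vector whose
# names are node IDs (tips are 1..n_tip, internal nodes come after).
label <- c("1" = "t2", "2" = "t1", "3" = "t3", "4" = "n4", "5" = "n5")
n_tip <- 3L

# Illustrative accessor (an assumption, not the package implementation):
# pick the node IDs implied by `type`, then subset the single vector.
get_labels <- function(label, n_tip, type = c("all", "tip", "internal")) {
  type <- match.arg(type)
  ids <- as.integer(names(label))
  keep <- switch(type,
                 all      = rep(TRUE, length(ids)),
                 tip      = ids <= n_tip,
                 internal = ids > n_tip)
  label[keep]
}

tips      <- get_labels(label, n_tip, "tip")       # labels for nodes 1, 2, 3
internals <- get_labels(label, n_tip, "internal")  # labels for nodes 4, 5
everyone  <- get_labels(label, n_tip)              # default type is "all"
```

Because every label carries its node ID as its name, no conditional lookup in two separate vectors is needed.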
One other modification of note: I also changed the labels<- default type
to 'all' instead of 'tip', matching the labels accessor. The idea here
is that the accessor and replacement forms of both tipLabels and
nodeLabels already provide shortcuts for working solely with tip and
node labels, respectively, so 'all' is a more sensible default for the
generalized labels accessor and replacement methods.
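As a rough illustration of that default, the following base-R sketch uses a hypothetical `set_labels` helper (not a phylobase function) whose `type` argument falls back to 'all' when omitted. The node numbering (tips first, then internal nodes) is an assumption carried over from the examples in this thread.

```r
# Illustrative replacement helper; `set_labels` and its arguments are
# hypothetical names chosen for this sketch.
set_labels <- function(label, n_tip, value,
                       type = c("all", "tip", "internal")) {
  type <- match.arg(type)  # when `type` is omitted, the first choice ("all") is used
  ids <- as.integer(names(label))
  target <- switch(type,
                   all      = rep(TRUE, length(ids)),
                   tip      = ids <= n_tip,
                   internal = ids > n_tip)
  stopifnot(length(value) == sum(target))  # one new label per targeted node
  label[target] <- value                   # names (node IDs) are preserved
  label
}

label <- c("1" = "t2", "2" = "t1", "3" = "t3", "4" = "n4", "5" = "n5")

# No `type` given: all five labels are replaced.
relabeled_all <- set_labels(label, 3L, c("a", "b", "c", "x", "y"))

# Restricting to tips has to be requested explicitly.
relabeled_tip <- set_labels(label, 3L, c("a", "b", "c"), type = "tip")
```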
I believe I've updated all parts of the package code that are, by
necessity, more tightly coupled to the internal slot configuration, and
as far as I can tell, everything is working properly.
I also intend to change the phylo4 constructor methods so that they
don't create and store explicit NAs for node labels when node labels are
not provided. I figure the same change is in order for edge labels and
edge lengths, too. In all cases the accessors will still explicitly
return NAs for missing values, so again, this change will be transparent
if you use accessors.
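A minimal sketch of why this stays transparent, in base R with hypothetical helper names (`fill_labels`, `compact`) rather than the real accessors: the stored vector holds only the labels that exist, and the accessor re-expands it by node ID so that missing entries come back as NA.

```r
# Only the labels that exist are stored; names are node IDs.
stored <- c("1" = "t2", "2" = "t1", "3" = "t3")

# Illustrative accessor (an assumption, not the package code): indexing a
# named vector by IDs it does not contain yields NA for those positions.
fill_labels <- function(stored, n_tip, n_node,
                        type = c("all", "tip", "internal")) {
  type <- match.arg(type)
  ids <- switch(type,
                all      = seq_len(n_tip + n_node),
                tip      = seq_len(n_tip),
                internal = n_tip + seq_len(n_node))
  out <- stored[as.character(ids)]
  names(out) <- ids  # restore IDs for entries that were missing
  out
}

# Hypothetical storage-side helper: drop NAs before saving to the slot.
compact <- function(x) x[!is.na(x)]

full     <- fill_labels(stored, 3L, 2L)              # five entries, last two NA
internal <- fill_labels(stored, 3L, 2L, "internal")  # both NA
```

Round-tripping `compact(full)` gives back the original three-element vector, which is the sense in which nothing is lost by not storing the NAs.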
After that, I'll tackle combining the data slots. This will require making
a couple of decisions that I'll pose to the list in a separate post.
Cheers,
Jim
Jim Regetz wrote:
> Quick reply just about the labels question:
>
> Peter Cowan wrote:
>>>> On Wed, 2009-09-16 at 15:17 -0700, Jim Regetz wrote:
>>>>> Addendum: In case anyone else's mind happens to wander in this
>>>>> direction, yes, I think a similar argument could be made for
>>>>> combining the slots for tip and internal _labels_ into a single
>>>>> label slot, because each label is now unambiguously identified
>>>>> by its name (node ID). Seems like the separation is a
>>>>> historical artifact? Combining them would simplify the
>>>>> corresponding accessor/replace methods, which currently have to
>>>>> look conditionally in either tip.label or node.label depending
>>>>> on the arguments. And it wouldn't be hard at all to make this
>>>>> change in the code base. Of course, I'm not going to ask for
>>>>> the moon *and* the stars, but if someone else proposed it... :)
>>>>>
>> Again, I think performance was the reason here. The assumption was that,
>> more often than not, trees will not have any internal node labels.
>
> That doesn't have to be a problem. What I said about tree data applies
> even more clearly here: only labels that actually exist need to be in
> the vector. So if you only supply tip labels when you create the tree,
> the (unified) label slot would be exactly the same as what we now call
> tip.label. Example with a 3-tip tree:
>
> ## actual slot contents -- no internal labels stored
> > phy@label
>    1    2    3
> "t2" "t1" "t3"
>
> ## but the accessors would still "fill in" the implied NAs:
> > labels(phy) ## default type is 'all'
>    1    2    3    4    5
> "t2" "t1" "t3"   NA   NA
>
> > tipLabels(phy)
>    1    2    3
> "t2" "t1" "t3"
>
> > nodeLabels(phy)
>  4  5
> NA NA
>
> ## now add internal labels
> > nodeLabels(phy) <- c("n4", "n5")
> > phy@label
>    1    2    3    4    5
> "t2" "t1" "t3" "n4" "n5"
>
> ## and remove them again!
> > nodeLabels(phy) <- as.character(NA)
> > phy@label
>    1    2    3
> "t2" "t1" "t3"
>
> I just quickly wrote up new accessor and replace methods that would
> behave this way. As illustrated above, the replacement method will also
> drop any NA labels it encounters, for efficiency (but obviously attempts
> to do this for tip labels will produce an error).
>
> Jim