[Phylobase-devl] unification of tree data slots

Jim Regetz regetz at nceas.ucsb.edu
Wed Sep 23 20:34:40 CEST 2009


Hi all,

In the slot-mods branch, phylo4 now has a single 'label' slot, replacing 
the separate 'tip.label' and 'node.label' slots in the original class 
definition. As proposed, the tipLabels, nodeLabels, and labels accessors 
all return exactly the same thing as before.

One other modification of note: I also changed the labels<- default type 
to 'all' instead of 'tip', matching the labels accessor. The idea here 
is that the accessor and replacement forms of both tipLabels and 
nodeLabels already provide shortcuts for working solely with tip and 
node labels, respectively, so 'all' is a more sensible default for the 
generalized labels accessor and replacement methods.
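
To make the default concrete, here is a minimal base-R sketch of the idea, not the phylobase code itself. The plain list standing in for a phylo4 object and the helpers ids_for, get_labels, and set_labels are hypothetical, and the type value 'internal' is an assumption; only 'all' and 'tip' are named above.

```r
# Toy stand-in for a tree: 3 tips (IDs 1-3) and 2 internal nodes (IDs 4-5).
# 'label' is a single character vector named by node ID.
toy <- list(
  ntips  = 3L,
  nnodes = 5L,
  label  = c("1" = "t2", "2" = "t1", "3" = "t3", "4" = "n4", "5" = "n5")
)

# Node IDs covered by each label type (hypothetical helper).
ids_for <- function(x, type = c("all", "tip", "internal")) {
  type <- match.arg(type)
  switch(type,
    all      = seq_len(x$nnodes),
    tip      = seq_len(x$ntips),
    internal = seq(x$ntips + 1L, x$nnodes)
  )
}

# Accessor and replacement share the same default type, 'all'.
get_labels <- function(x, type = "all") {
  x$label[as.character(ids_for(x, type))]
}

set_labels <- function(x, value, type = "all") {
  ids <- ids_for(x, type)
  stopifnot(length(value) == length(ids))
  x$label[as.character(ids)] <- value
  x
}

# With no type given, the replacement touches every label.
toy2 <- set_labels(toy, c("a", "b", "c", "d", "e"))
# Restricting to tips must be requested explicitly.
toy3 <- set_labels(toy, c("x", "y", "z"), type = "tip")
```

With matching defaults, get_labels(x) and set_labels(x, value) address the same set of nodes, which is the symmetry the change is after.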

I believe I've updated all parts of the package code that are, by 
necessity, more tightly coupled to the internal slot configuration, and 
as far as I can tell, everything is working properly.

I also intend to change the phylo4 constructor methods so that they 
don't create and store explicit NAs for node labels when node labels are 
not provided. I figure the same change is in order for edge labels and 
edge lengths, too. In all cases the accessors will still explicitly 
return NAs for missing values, so again, this change will be transparent 
if you use accessors.
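
As a rough illustration of how storage and accessors can diverge like this, the base-R sketch below keeps only existing labels and fills in NAs on access. The function names all_labels and set_node_labels are hypothetical and the code is not taken from the package; it also omits the check that would reject NA tip labels.

```r
# Sparse label storage: only labels that exist are kept, named by node ID.
stored  <- c("1" = "t2", "2" = "t1", "3" = "t3")  # 3 tips, no node labels
n_nodes <- 5L                                     # tips plus internal nodes

# Accessor: expand to every node ID, filling unlabeled nodes with NA.
all_labels <- function(stored, n_nodes) {
  ids <- as.character(seq_len(n_nodes))
  out <- stored[ids]   # names absent from 'stored' yield NA values
  names(out) <- ids    # restore the IDs, which are lost on the NA entries
  out
}

# Replacement: write the new values, then drop any that are NA.
set_node_labels <- function(stored, ids, value) {
  stored[as.character(ids)] <- value
  stored[!is.na(stored)]
}

with_nodes <- set_node_labels(stored, 4:5, c("n4", "n5"))
without    <- set_node_labels(with_nodes, 4:5, NA_character_)
```

Here all_labels(stored, n_nodes) always has one entry per node, while the stored vector grows and shrinks with the labels actually supplied.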

Then I'll tackle combining the data slots next. This will require making 
a couple of decisions that I'll pose to the list in a separate post.

Cheers,
Jim

Jim Regetz wrote:
> Quick reply just about the labels question:
> 
> Peter Cowan wrote:
>>>> On Wed, 2009-09-16 at 15:17 -0700, Jim Regetz wrote:
>>>>> Addendum: In case anyone else's mind happens to wander in this 
>>>>> direction, yes, I think a similar argument could be made for 
>>>>> combining the slots for tip and internal _labels_ into a single
>>>>> label slot, because each label is now unambiguously identified
>>>>> by its name (node ID). Seems like the separation is a
>>>>> historical artifact? Combining them would simplify the
>>>>> corresponding accessor/replace methods, which currently have to
>>>>> look conditionally in either tip.label or node.label depending
>>>>> on the arguments. And it wouldn't be hard at all to make this 
>>>>> change in the code base. Of course, I'm not going to ask for
>>>>> the moon *and* the stars, but if someone else proposed it... :)
>>>>>
>> Again, I think performance was the reason here. The assumption was
>> that, more often than not, trees will not have any internal node labels.
> 
> That doesn't have to be a problem. What I said about tree data applies 
> even more clearly here: only labels that actually exist need to be in 
> the vector. So if you only supply tip labels when you create the tree, 
> the (unified) label slot would be exactly the same as what we now call 
> tip.label. Example with a 3-tip tree:
> 
> ## actual slot contents -- no internal labels stored
>  > phy@label
>     1    2    3
> "t2" "t1" "t3"
> 
> ## but the accessors would still "fill in" the implied NAs:
>  > labels(phy) ## default type is 'all'
>     1    2    3    4    5
> "t2" "t1" "t3"   NA   NA
> 
>  > tipLabels(phy)
>     1    2    3
> "t2" "t1" "t3"
> 
>  > nodeLabels(phy)
>   4  5
> NA NA
> 
> ## now add internal labels
>  > nodeLabels(phy) <- c("n4", "n5")
>  > phy@label
>     1    2    3    4    5
> "t2" "t1" "t3" "n4" "n5"
> 
> ## and remove them again!
>  > nodeLabels(phy) <- as.character(NA)
>  > phy@label
>     1    2    3
> "t2" "t1" "t3"
> 
> I just quickly wrote up new accessor and replace methods that would 
> behave this way. As illustrated above, the replacement method will also 
> drop any NA labels it encounters, for efficiency (but obviously attempts 
> to do this for tip labels will produce an error).
> 
> Jim
> _______________________________________________
> Phylobase-devl mailing list
> Phylobase-devl at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/phylobase-devl