[Phylobase-devl] unification of tree data slots

Steven Kembel steve.kembel at gmail.com
Thu Sep 17 20:57:08 CEST 2009


Hello,

I've been out of the loop for a while but wanted to quickly say that  
reworking the labels/data to be a single slot and letting accessors  
deal with making it look consistent sounds good. IIRC the main  
argument previously against a single tip/node data slot was the  
storage space issue (i.e. when I load a phylogeny with 20K tips I  
don't want to unnecessarily store node data if it doesn't exist) but it
sounds like this is no longer an issue since the data are not stored  
if they don't exist?

Cheers,
Steve

On Sep 17, 2009, at 11:54 AM, Jim Regetz wrote:

> Quick reply just about the labels question:
>
> Peter Cowan wrote:
>>>> On Wed, 2009-09-16 at 15:17 -0700, Jim Regetz wrote:
>>>>>
>>>>> Addendum: In case anyone else's mind happens to wander in this
>>>>> direction, yes, I think a similar argument could be made for
>>>>> combining the slots for tip and internal _labels_ into a single
>>>>> label slot, because each label is now unambiguously identified
>>>>> by its name (node ID). Seems like the separation is a
>>>>> historical artifact? Combining them would simplify the
>>>>> corresponding accessor/replace methods, which currently have to
>>>>> look conditionally in either tip.label or node.label depending
>>>>> on the arguments. And it wouldn't be hard at all to make this
>>>>> change in the code base. Of course, I'm not going to ask for
>>>>> the moon *and* the stars, but if someone else proposed it... :)
>>>>>
>>
>> Again, I think performance was the reason here.  The assumption was that,
>> more often than not, trees will not have any internal node labels.
>
> That doesn't have to be a problem. What I said about tree data applies
> even more clearly here: only labels that actually exist need to be in
> the vector. So if you only supply tip labels when you create the tree,
> the (unified) label slot would be exactly the same as what we now call
> tip.label. Example with a 3-tip tree:
>
> ## actual slot contents -- no internal labels stored
>> phy@label
>    1    2    3
> "t2" "t1" "t3"
>
> ## but the accessors would still "fill in" the implied NAs:
>> labels(phy) ## default type is 'all'
>    1    2    3    4    5
> "t2" "t1" "t3"   NA   NA
>
>> tipLabels(phy)
>    1    2    3
> "t2" "t1" "t3"
>
>> nodeLabels(phy)
>  4  5
> NA NA
>
> ## now add internal labels
>> nodeLabels(phy) <- c("n4", "n5")
>> phy@label
>    1    2    3    4    5
> "t2" "t1" "t3" "n4" "n5"
>
> ## and remove them again!
>> nodeLabels(phy) <- as.character(NA)
>> phy@label
>    1    2    3
> "t2" "t1" "t3"
>
> I just quickly wrote up new accessor and replace methods that would
> behave this way. As illustrated above, the replacement method will also
> drop any NA labels it encounters, for efficiency (but obviously attempts
> to do this for tip labels will produce an error).
>
> Jim
> _______________________________________________
> Phylobase-devl mailing list
> Phylobase-devl at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/phylobase-devl
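
For readers following the thread, the sparse label storage described in the
quoted message can be sketched in base R roughly as follows. This is only an
illustration, not the phylobase code Jim refers to: the tree is a plain list
rather than a phylo4 object, and new_tree(), get_labels(), set_node_labels()
and set_tip_labels() are hypothetical names. It also assumes that a short
replacement value is recycled across all internal nodes, as the single NA in
the quoted example suggests.

```r
## Hypothetical stand-in for a tree: a plain list, NOT a phylo4 object.
## 'label' stores only the labels that exist, named by node ID.
new_tree <- function(tip_labels, n_internal) {
  n_tip <- length(tip_labels)
  list(n_tip  = n_tip,
       n_node = n_tip + n_internal,
       label  = setNames(as.character(tip_labels),
                         as.character(seq_len(n_tip))))
}

## Node IDs by type; tips are assumed to be numbered 1..n_tip.
node_ids <- function(tree, type = c("all", "tip", "internal")) {
  type <- match.arg(type)
  ids <- seq_len(tree$n_node)
  switch(type,
         all      = ids,
         tip      = ids[ids <= tree$n_tip],
         internal = ids[ids >  tree$n_tip])
}

## Accessor: expand the sparse slot, filling NA for labels not stored.
get_labels <- function(tree, type = c("all", "tip", "internal")) {
  ids <- as.character(node_ids(tree, type))
  setNames(unname(tree$label[ids]), ids)
}

## Replacement for internal labels: NA values are dropped, not stored.
set_node_labels <- function(tree, value) {
  ids <- as.character(node_ids(tree, "internal"))
  ## Assumption: recycle 'value' to the number of internal nodes.
  new <- setNames(rep_len(as.character(value), length(ids)), ids)
  tips <- tree$label[!(names(tree$label) %in% ids)]
  tree$label <- c(tips, new[!is.na(new)])
  tree
}

## Replacement for tip labels: every tip needs a non-NA label.
set_tip_labels <- function(tree, value) {
  if (length(value) != tree$n_tip || anyNA(value)) {
    stop("tip labels must be supplied for every tip and cannot be NA")
  }
  ids <- as.character(node_ids(tree, "tip"))
  internal <- tree$label[!(names(tree$label) %in% ids)]
  tree$label <- c(setNames(as.character(value), ids), internal)
  tree
}
```

With phy <- new_tree(c("t2", "t1", "t3"), 2), phy$label holds three entries
while get_labels(phy) returns five, the last two NA. Calling
set_node_labels(phy, c("n4", "n5")) grows the stored vector to five entries,
and set_node_labels() with NA_character_ shrinks it back to three. Because
storage only grows when internal labels are actually supplied, a large tree
with tip labels only costs no more than a separate tip.label slot would.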


