[Phylobase-devl] unification of tree data slots
Jim Regetz
regetz at nceas.ucsb.edu
Mon Sep 21 23:05:49 CEST 2009
Okay, I think the responses to this proposal ranged from somewhat
hesitant to definitely supportive, with center of mass somewhere on the
positive side of neutral :)
I think it's worth giving this a shot. And because it would (I believe)
cleanly fix some existing bugs/buglets that I'd rather not patch up with
workarounds, I'd prefer to try it now.
Perhaps a branch is in order? I think the changes can be implemented
without too much pain, but it would be nice to know I/we can commit
partial changes if need be, without worrying about passing package check
with every commit.
Please let me know if you don't think I captured the group sentiment, or
if you have other reactions/thoughts.
Thanks!
Jim
Ben Bolker wrote:
> Agreed. I think the only concern is the "changing things around"
> issue. I'm OK with the idea that if people have node data for just a
> few nodes, then they have to pay the cost of storing NAs for all the
> rest. I am much happier with the "changing things around" plan now that
> we are starting to have a halfway-decent testing framework so that we
> can be slightly more certain that we're not f*cking everything up by
> making changes ...
>
> So I'd say I'm a +0 -- I'm not going to argue against it, but I won't
> do the work either :-)
> At some point I *do* want to get back into helping develop, but I
> can't even afford the time to get back up to speed about the current
> status ...
>
> cheers
> Ben
>
> Steven Kembel wrote:
>> Hello,
>>
>> I've been out of the loop for a while but wanted to quickly say that
>> reworking the labels/data to be a single slot and letting accessors
>> deal with making it look consistent sounds good. IIRC the main
>> argument previously against a single tip/node data slot was the
>> storage space issue (i.e. when I load a phylogeny with 20K tips I
>> don't want to unnecessarily store node data if it doesn't exist) but it
>> sounds like this is no longer an issue since the data are not stored
>> if they don't exist?
>>
>> Cheers,
>> Steve
>>
>> On Sep 17, 2009, at 11:54 AM, Jim Regetz wrote:
>>
>>> Quick reply just about the labels question:
>>>
>>> Peter Cowan wrote:
>>>>>> On Wed, 2009-09-16 at 15:17 -0700, Jim Regetz wrote:
>>>>>>> Addendum: In case anyone else's mind happens to wander in this
>>>>>>> direction, yes, I think a similar argument could be made for
>>>>>>> combining the slots for tip and internal _labels_ into a single
>>>>>>> label slot, because each label is now unambiguously identified
>>>>>>> by its name (node ID). Seems like the separation is a
>>>>>>> historical artifact? Combining them would simplify the
>>>>>>> corresponding accessor/replace methods, which currently have to
>>>>>>> look conditionally in either tip.label or node.label depending
>>>>>>> on the arguments. And it wouldn't be hard at all to make this
>>>>>>> change in the code base. Of course, I'm not going to ask for
>>>>>>> the moon *and* the stars, but if someone else proposed it... :)
>>>>>>>
>>>> Again, I think performance was the reason here. The assumption was
>>>> that, more often than not, trees will not have any internal node labels.
>>> That doesn't have to be a problem. What I said about tree data applies
>>> even more clearly here: only labels that actually exist need to be in
>>> the vector. So if you only supply tip labels when you create the tree,
>>> the (unified) label slot would be exactly the same as what we now call
>>> tip.label. Example with a 3-tip tree:
>>>
>>> ## actual slot contents -- no internal labels stored
>>>> phy@label
>>> 1 2 3
>>> "t2" "t1" "t3"
>>>
>>> ## but the accessors would still "fill in" the implied NAs:
>>>> labels(phy) ## default type is 'all'
>>> 1 2 3 4 5
>>> "t2" "t1" "t3" NA NA
>>>
>>>> tipLabels(phy)
>>> 1 2 3
>>> "t2" "t1" "t3"
>>>
>>>> nodeLabels(phy)
>>> 4 5
>>> NA NA
>>>
>>> ## now add internal labels
>>>> nodeLabels(phy) <- c("n4", "n5")
>>>> phy@label
>>> 1 2 3 4 5
>>> "t2" "t1" "t3" "n4" "n5"
>>>
>>> ## and remove them again!
>>>> nodeLabels(phy) <- as.character(NA)
>>>> phy@label
>>> 1 2 3
>>> "t2" "t1" "t3"
>>>
>>> I just quickly wrote up new accessor and replace methods that would
>>> behave this way. As illustrated above, the replacement method will
>>> also drop any NA labels it encounters, for efficiency (but obviously
>>> attempts to do this for tip labels will produce an error).
>>>
>>> Jim
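For anyone who wants to try the idea from the quoted example, here is a minimal sketch of a unified label vector keyed by node ID. It uses a plain list rather than the actual phylo4 S4 class, and `getLabels` and `setNodeLabels` are hypothetical names chosen for illustration, not phylobase functions. It only covers internal-node replacement; the error on NA tip labels mentioned above is not modelled.

```r
## Hypothetical stand-in for a tree: 3 tips (IDs 1-3), 2 internal nodes (IDs 4-5).
## Only labels that actually exist are stored, named by node ID.
phy <- list(nTips = 3L, nNodes = 2L,
            label = c("1" = "t2", "2" = "t1", "3" = "t3"))

## Accessor (assumed design): look labels up by node ID, so any ID
## without a stored label comes back as NA.
getLabels <- function(phy, type = c("all", "tip", "internal")) {
    type <- match.arg(type)
    tips <- seq_len(phy$nTips)
    ints <- phy$nTips + seq_len(phy$nNodes)
    ids <- switch(type, all = c(tips, ints), tip = tips, internal = ints)
    out <- phy$label[as.character(ids)]
    names(out) <- ids
    out
}

## Replacement (assumed design): recycle the new values over the internal
## node IDs, then keep the tip labels plus only the non-NA internal labels.
setNodeLabels <- function(phy, value) {
    ints <- phy$nTips + seq_len(phy$nNodes)
    value <- rep(as.character(value), length.out = length(ints))
    names(value) <- ints
    tips <- phy$label[as.character(seq_len(phy$nTips))]
    phy$label <- c(tips, value[!is.na(value)])
    phy
}

getLabels(phy)                           # five entries, NA for nodes 4 and 5
phy2 <- setNodeLabels(phy, c("n4", "n5"))
phy2$label                               # now stores all five labels
phy3 <- setNodeLabels(phy2, NA)
phy3$label                               # back to the three tip labels
```

The point of the design is that storage cost scales with the labels actually supplied, while callers always see a vector of the expected length.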
>
>