[Phylobase-devl] several proposed additions

Peter Cowan pdc at berkeley.edu
Thu Jul 2 02:20:33 CEST 2009


Hi all,

I'm sorry that I haven't had much time to work on phylobase and won't  
for the next few months, but I did want to chime in.

I'm all for the testing, it's been on my todo list to get a testing  
framework in place for the package, so a big thanks to Jim for doing  
that.

I too had written a really simple NAMESPACE file, but didn't committed  
it.  I don't have strong feelings either way, but there is at least  
one function that I would like to hide, but cannot name with a leading  
dot (due to grid issues). This is a pain with a NAMESPACE, but it  
isn't even possible without one.

Peter

On Jun 25, 2009, at 5:27 AM, François Michonneau wrote:

> Hi Jim
>
>  Go ahead and commit your fixes. I don't think your changes should
> affect me too much. I have mostly been working on redesigning the way
> data and tree are matched and on cleaning up the code.
>
>  The masking problem of nTips from ape is resolved since we changed  
> the
> naming convention of the functions to camelCase (I think).
>
>  RUnit sounds really useful. I'm definitely going to look more into
> this.
>
>  I don't know much about namespaces, but if they make the life of
> people using phylobase to build their packages easier, then we should
> use it.
>
>  Cheers,
>  -- François
>
> On Wed, 2009-06-24 at 17:45 -0700, Jim Regetz wrote:
>> Followup responses/questions throughout, below:
>>
>> Ben Bolker wrote:
>>> Hilmar Lapp wrote:
>>>> Hi Jim -
>>>>
>>>> this sounds great! I can't speak on behalf of the phylobase  
>>>> leads, but
>>>> can you commit all your changes to a branch? That would make it
>>>> easiest for people to check out and play with your code. The unit
>>>> tests sound cool - haven't been aware of this, but if it is what  
>>>> the
>>>> name suggests it will be highly useful.
>>>>
>>>> 	-hilmar
>>
>> Hilmar: I was thinking of this too, but the changes are not as  
>> extensive
>> as they seem at first blush. The namespace is just a single file  
>> that we
>> can easily take out if it causes hassles. The tests don't affect the
>> functioning of the package in any way, so no harm there. And I've  
>> tried
>> to ensure that the phylo4 methods changes don't change any existing
>> behavior other than how it handles numeric node labels (and this  
>> needed
>> to be massaged in any case). Trunk commit should be fine -- agreed?
>>
>>>> On Jun 24, 2009, at 3:45 AM, Jim Regetz wrote:
>>>>>
>>>>> * NAMESPACE: I've created one. Seems to work fine. I'm basically
>>>>> exporting all functions and methods now. I figure it's easier to  
>>>>> start
>>>>> with a liberal approach, and later remove whatever items folks  
>>>>> agree
>>>>> are
>>>>> only required internally.
>>>
>>>  We've had a lot of arguments about whether to use namespaces or
>>> not (I have been very frustrated by namespaces in the past, partly
>>> because I don't really understand them), but I don't think it's
>>> a bad idea.
>>>
>>
>> Ben: Did your frustration involve creating namespaces, or working  
>> with
>> packages that use them? I acknowledge one issue is that it  
>> complicates
>> on-the-fly modification of functions within an R session, which can  
>> be a
>> little annoying when debugging/developing: e.g., if you use fix() to
>> modify function getNode in an R session, other exported phylobase
>> functions will nevertheless continue to use the original getNode
>> definition. But using fixInNamespace() or just sourcing the pkg/R
>> directory when developing can help.
>>
>> An advantage is that other packages can leverage phylobase by  
>> flexibly
>> importing its methods/generics. For example, I'm working on new  
>> package
>> that depends on phylobase. To use subset("phylo4", ...), I currently
>> have to specify phylobase:::subset() in my code because it's  
>> otherwise
>> not treated as an S4 generic. However, if subset is exported in the
>> phylobase NAMESPACE, I can import it in my package's NAMESPACE and
>> method dispatch works as expected.
>>
>> Also FYI, I think the namespace will also obviate the need for the  
>> line
>> of code in methods-phylo4.R that masks ape::nTips, because nTips as
>> defined in phylobase will always take precedence? (I'm going out on a
>> limb a little bit here, because I'm not quite sure what the masking
>> problem was in the first place :)
>>
>> And of course, there are the standard benefits of hiding internal
>> functions, and knowing that functions will work right regardless of  
>> what
>> synonymous objects users create in their global environment or load  
>> via
>> other packages.
>>
>> Perhaps these are somewhat minor benefits in isolation, but I think  
>> it's
>> the right way to go given that phylobase is intended to serve as a
>> foundation package that more specialized phylo packages will depend  
>> on.
>>
>>>>> * Unit tests: I've created a handful of RUnit tests, and added the
>>>>> necessary "run tests" machinery as documented here:
>>>>> http://wiki.r-project.org/rwiki/doku.php?id=developers:runit
>>>>>
>>>>> The tests are not even remotely comprehensive, but they do test
>>>>> against
>>>>> the bug fixes I've implemented in the last week or so. I find unit
>>>>> tests
>>>>> to be hugely helpful in making sure that fixed bugs stay fixed,  
>>>>> and
>>>>> that
>>>>> code changes in one area don't unexpectedly break things somewhere
>>>>> else.
>>>>> And I find RUnit testing to provide better structure and test
>>>>> expressivity than you get with unstructured test scripts in the  
>>>>> pkg/
>>>>> test
>>>>> directory.
>>>>>
>>>
>>>  sounds great.
>>
>> OK I'll commit them when I get a chance. Worse case scenario is that
>> they break the R-forge check; as standalone tests, they certainly  
>> don't
>> affect package functioning in anyway.
>>
>>>>> I'm not 100% certain that R-forge would do the right thing with  
>>>>> these
>>>>> tests without some additional tweaking, but they are successfully
>>>>> run by
>>>>> R CMD CHECK on my local machine (at least as long as phylobase is
>>>>> already installed).
>>>>>
>>>>> * phylo4 generic: Any reason why there has been a generic for  
>>>>> phylo4d,
>>>>> but not phylo4? I've now created a phylo4 generic, along with two
>>>>> methods: one with signature x="matrix" (replacing the original  
>>>>> phylo4
>>>>> constructor function), and one with signature x="phylo" (imports  
>>>>> phylo
>>>>> to phylo4). I updated the associated doc pages to match, plus  
>>>>> fixed a
>>>>> few places in the examples and sources where phylo(edge=...) was
>>>>> called
>>>>> explicitly (the first argument is now the more generic 'x', not
>>>>> 'edge').
>>>>>
>>>>> * Dealing with messy node labels: My motivation for creating the
>>>>> phylo4
>>>>> generic was to expose an option allowing node labels to be dropped
>>>>> (or,
>>>>> in the case of phylo4d, added to the tree data) during phylo  
>>>>> import.
>>>>> This is basically just a modified implementation of what François
>>>>> did to
>>>>> deal with MrBayes sorts of trees, except now focused  
>>>>> specifically on
>>>>> phylo import (it could be added elsewhere too), and now with user-
>>>>> level
>>>>> control of the behavior. I'm glossing over the details, but for  
>>>>> those
>>>>> not paying attention to the bug tracker, see more discussion here:
>>>>> https://r-forge.r-project.org/tracker/index.php?func=detail&aid=466&group_id=111&atid=488
>>>>>
>>>>> FYI, R CMD CHECK passes on all the above changes (well, aside  
>>>>> from a
>>>>> latex documentation issue that I think has been around for some  
>>>>> time).
>>>>>
>>>>> I'll be traveling much of the next month, but would be happy to
>>>>> discuss
>>>>> any of this further. And of course, I can push some/all of this  
>>>>> up to
>>>>> the repository at any time, or send relevant code to the list if
>>>>> requested.
>>>>>
>>>
>>>  Please do push.  We need to move ahead.  I want to spend more time
>>> working this summer (maybe with Francois), but probably won't happen
>>> until second half of July ...
>>
>> Great! François, can I commit these phylo4/phylo4d changes without
>> disrupting you? I prefer to deal with this before the namespace and  
>> unit
>> tests, because they both reflect the local changes I've made to
>> phylo4/phylo4d.
>>
>> To summarize, dropping node labels will become an optional feature of
>> both phylo4(x="phylo") (which is itself a new method) and
>> phylo4d(x="phylo"), with the latter also allowing the node labels  
>> to be
>> preserved as tree data. In the labels-as-data case, the code  
>> currently
>> uses the same as.numeric() call that you had, resulting in NAs if you
>> try to convert non-number-like node labels (e.g., "a", "b") to  
>> data. But
>> it would be easy enough to add smarter handling of this later if  
>> we're
>> happy with the overall approach.
>>
>> Out of curiosity, in the short term do we want this node label  
>> dropping
>> option to exist in places beyond just ape tree import functions?
>>
>> Thanks!
>> Jim
>>
>>>  Ben
>>>
>> _______________________________________________
>> Phylobase-devl mailing list
>> Phylobase-devl at lists.r-forge.r-project.org
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/phylobase-devl
>
> _______________________________________________
> Phylobase-devl mailing list
> Phylobase-devl at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/phylobase-devl



More information about the Phylobase-devl mailing list