[Phylobase-devl] Phylobase GSoC idea

Hilmar Lapp hlapp at duke.edu
Tue Mar 11 03:32:32 CET 2008


This sounds great! -hilmar

On Mar 10, 2008, at 8:38 PM, Peter Cowan wrote:

> Here is another idea, this one in need of a mentor.  Thibaut, Ben?
>
> Peter
>
> Rationale
>
> Methods for visualizing phylogenetic trees and associated data have
> not kept pace with growth in tree or dataset size.  At a recent
> NESCent hackathon a new R package for comparative methods, phylobase,
> was designed.  The phylobase package uses modern R classes and methods
> to store and manipulate phylogenetic trees.  Currently, plotting of
> phylogenetic trees in phylobase relies on R base graphics and on
> functions in other packages.  These implementations are not well
> suited to displaying either large phylogenetic trees or trees with
> large amounts of associated data.
>
> Approach
>
> The R programming language has two primary graphics interfaces: the
> base graphics interface and the newer, more extensible grid system.
> The grid system is much more flexible and will allow for consistent
> scaling and resizing of trees and data.  Current plot methods in
> phylobase are based on base graphics and suffer from resizing and
> layout difficulties.  This project will develop new plot methods
> based on the grid system.
>
> Challenges
>
> The primary challenge will be writing algorithms for efficiently
> converting tree structures to the grid language.  Examples of similar
> algorithms exist for plotting using the old base graphics interface.
>
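
To make the conversion step concrete, here is a minimal sketch in R of
turning a tree into grid drawing calls. It is only an illustration under
stated assumptions: the toy edge matrix, the helper names tree_coords()
and draw_tree(), and the rectangular layout are all hypothetical and are
not phylobase code. The grid calls used (grid.newpage, viewport,
pushViewport, grid.segments, popViewport) are the standard ones from the
grid package. The recursive layout is written for clarity, not speed.

```r
library(grid)

# Hypothetical toy tree as an edge matrix (parent, child).
# Tips are nodes 1..3, the root is node 4, node 5 is internal.
edge <- matrix(c(4, 5,
                 5, 1,
                 5, 2,
                 4, 3), ncol = 2, byrow = TRUE)
edge_len <- c(1, 1, 1, 2)
n_tips <- 3

# Compute (x, y) for every node: x is the distance from the root,
# y is the tip index for tips and the mean of the children otherwise.
tree_coords <- function(edge, edge_len, n_tips) {
  n_nodes <- max(edge)
  x <- numeric(n_nodes)
  y <- numeric(n_nodes)
  y[seq_len(n_tips)] <- seq_len(n_tips)
  root <- setdiff(edge[, 1], edge[, 2])
  visit <- function(node) {
    rows <- which(edge[, 1] == node)
    for (r in rows) {
      child <- edge[r, 2]
      x[child] <<- x[node] + edge_len[r]  # child sits one branch further out
      visit(child)
    }
    # After all children are placed, centre the internal node on them.
    if (length(rows) > 0) y[node] <<- mean(y[edge[rows, 2]])
  }
  visit(root)
  data.frame(node = seq_len(n_nodes), x = x, y = y)
}

# Draw the tree in a viewport with native scales, so the same
# coordinates rescale with the device when the window is resized.
draw_tree <- function(edge, coords) {
  grid.newpage()
  pushViewport(viewport(width = 0.8, height = 0.8,
                        xscale = c(0, max(coords$x)),
                        yscale = c(0.5, max(coords$y) + 0.5)))
  p <- edge[, 1]
  ch <- edge[, 2]
  # Vertical connector at the parent's x, then horizontal branch to the child.
  grid.segments(coords$x[p], coords$y[p], coords$x[p], coords$y[ch],
                default.units = "native")
  grid.segments(coords$x[p], coords$y[ch], coords$x[ch], coords$y[ch],
                default.units = "native")
  popViewport()
}

coords <- tree_coords(edge, edge_len, n_tips)
draw_tree(edge, coords)
```

Because the segments are given in native units inside a viewport, the
layout is computed once and grid handles the rescaling, which is the
property the base graphics methods lack.
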
> Involved toolkits or projects
>
> phylobase, R, C/C++, grid
>
> Mentors
>
> Thibaut?
>
> On Mar 10, 2008, at 1:51 PM, Steve Kembel wrote:
>
>> Hi all,
>>
>> Here's a Google Summer of Code 'idea'. Deadline for getting these up
>> on the wiki is today. Thoughts? Edits? Anyone else want to sign up to
>> be a mentor? Any other ideas? People suggested plotting,
>> RUnit/testing, linking with nexml or phyloxml...?
>>
>> Rationale
>>
>> There is a need for efficient phylogenetic tree manipulation methods
>> in the R statistical package to take advantage of the statistical
>> computing ability of R for bioinformatics and comparative phylogenetic
>> analyses. NESCent sponsored a hackathon focused on integration of
>> comparative methods within the R statistical package to promote
>> interoperability, the support of data exchange standards, and greater
>> usability of tools and methods in evolutionary bioinformatics. One
>> result of this hackathon has been the development of the phylobase
>> package, which seeks to provide a set of S4 classes and methods for
>> representing and manipulating phylogenetic trees and data in R.
>> Currently phylobase contains structures for representing phylogenetic
>> trees and associated data, but methods for tree manipulation remain
>> incomplete or have not been optimized. Current implementations of
>> phylogenetic tree storage and manipulation are inadequate for working
>> with the large-tree and multiple-tree datasets that are increasingly
>> common in bioinformatics and comparative biology.
>>
>> Approach
>>
>> The R programming language, an object-oriented statistical
>> programming language, has recently introduced a new object-oriented
>> class system (S4). Phylogenetic trees in phylobase are currently
>> represented as S4 data objects. The methods for tree manipulation are
>> currently a mixture of S3 and S4 methods and C/C++ extensions. The
>> approach for this project will be to identify obstacles to
>> manipulating large trees and datasets, which could include optimizing
>> tree or data representation in memory, and to develop and implement
>> efficient algorithms for tree representation and manipulation using
>> object-oriented S4 classes and methods or C/C++ extensions.
>>
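
For anyone not yet familiar with S4, a minimal sketch of what a tree
class with a validity check and generic methods looks like. The class
name SimpleTree, its slots, and the generics nTipsOf() and childrenOf()
are hypothetical names made up for this example; phylobase's actual
classes and methods differ. Only setClass, setGeneric, setMethod and new
from the methods package are used.

```r
library(methods)

# Hypothetical class for illustration only; not a phylobase class.
setClass("SimpleTree",
         representation(edge = "matrix", tip_label = "character"),
         validity = function(object) {
           if (ncol(object@edge) != 2)
             return("edge must have two columns (parent, child)")
           TRUE
         })

# Generics dispatch on the class of their argument.
setGeneric("nTipsOf", function(x) standardGeneric("nTipsOf"))
setMethod("nTipsOf", "SimpleTree", function(x) length(x@tip_label))

setGeneric("childrenOf", function(x, node) standardGeneric("childrenOf"))
setMethod("childrenOf", "SimpleTree", function(x, node) {
  # Vectorized lookup: rows whose parent is `node`, return the child column.
  x@edge[x@edge[, 1] == node, 2]
})

tr <- new("SimpleTree",
          edge = matrix(c(4, 5, 5, 1, 5, 2, 4, 3), ncol = 2, byrow = TRUE),
          tip_label = c("a", "b", "c"))

nTipsOf(tr)
childrenOf(tr, 5)
```

The validity function runs when new() is given slot values, so a
malformed edge matrix is rejected at construction time rather than
failing later inside a manipulation method.
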
>> Challenges
>>
>> While the R statistical programming language is extremely powerful
>> and provides a rich feature set, it is inefficient at handling very
>> large objects and heavy computational lifting (recursion, for-loops).
>> The general challenge for this project will be to identify data
>> structures and methods that have the greatest impact on the ability
>> to work with very large trees and datasets, and to implement these
>> structures and methods in a more efficient way. This will require
>> profiling and testing of existing code, the use of S4 classes and
>> methods, and possibly the R API and C/C++ extensions to the R
>> language.
>>
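
As a rough illustration of the kind of profiling comparison this would
involve, the sketch below times a for-loop against a vectorized
equivalent for one simple tree operation, counting the children of each
node. The random "edge table" and the function names count_loop() and
count_vec() are hypothetical, chosen only for the example. It uses
system.time(); for whole functions, R's built-in profiler Rprof() is the
usual next step.

```r
# Hypothetical large edge table: each of n children gets a random
# parent drawn from node ids n+1 .. 2n.
set.seed(1)
n <- 50000
parent <- sample(n + seq_len(n), n, replace = TRUE)

# Loop version: count children per parent one edge at a time.
count_loop <- function(parent, n_nodes) {
  counts <- integer(n_nodes)
  for (p in parent) counts[p] <- counts[p] + 1L
  counts
}

# Vectorized version: tabulate() does the same counting in compiled code.
count_vec <- function(parent, n_nodes) tabulate(parent, nbins = n_nodes)

t_loop <- system.time(a <- count_loop(parent, 2 * n))
t_vec  <- system.time(b <- count_vec(parent, 2 * n))

# Both versions must agree before the timings mean anything.
stopifnot(identical(a, b))
print(c(loop = t_loop[["elapsed"]], vectorized = t_vec[["elapsed"]]))
```

Timings depend on the machine and R version, so none are quoted here;
the point is the workflow of checking equivalence first and then
comparing elapsed time.
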
>> Involved toolkits or projects
>>
>> phylobase, R, S4 classes
>>
>> Mentors
>>
>> Steven Kembel, ?
>> _______________________________________________
>> Phylobase-devl mailing list
>> Phylobase-devl at lists.r-forge.r-project.org
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/phylobase-devl
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:- hlapp at duke dot edu :
===========================================================