PhyloSoC:BioPerl integration of the NeXML exchange standard and Bio::Phylo toolkit

From Phyloinformatics
Revision as of 09:19, 20 May 2009 by Chmille4 (talk) (Authors)

2009 Google Summer of Code Project

Title

BioPerl integration of the NeXML exchange standard and Bio::Phylo toolkit


Authors

Student: Chase Miller

Mentors: Mark A. Jensen (primary), Rutger Vos

Abstract

This project will integrate the NeXML exchange standard into BioPerl, facilitating adoption of the standard and easing the transition from the overworked NEXUS standard. A wrapper will give BioPerl native access to the preferred NeXML parser (Bio::Phylo), so that Bio::Phylo and NeXML can continue to co-evolve without being encumbered by BioPerl. Additionally, test cases and example sets that target real-world uses will be developed.


Project Plan

May 23rd - June 5th

BioPerl Objects to NeXML

  • Write a module for converting BioPerl data into the NeXML standard by implementing the BioPerl object interfaces using the Bio::Phylo APIs. Implementing the BioPerl interfaces allows Bio::Phylo objects, which are external to BioPerl, to be used as if they were native BioPerl objects.
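
The wrapper approach described above is an instance of the adapter pattern. The following is a minimal, language-neutral sketch of that idea, written in Python rather than Perl purely for illustration. Every name in it (HostTree, ExternalTree, WrappedTree and their methods) is hypothetical and does not correspond to any actual BioPerl or Bio::Phylo API; HostTree merely plays the role of a BioPerl interface and ExternalTree the role of a Bio::Phylo object.

```python
from abc import ABC, abstractmethod


class HostTree(ABC):
    """Hypothetical host-toolkit interface (plays the role of a BioPerl interface)."""

    @abstractmethod
    def get_leaf_names(self):
        """Return the names of all terminal nodes."""

    @abstractmethod
    def number_nodes(self):
        """Return the total number of nodes."""


class ExternalTree:
    """Hypothetical external-library tree (plays the role of a Bio::Phylo object)."""

    def __init__(self, root, children):
        self.root = root
        self.children = children  # maps a node name to a list of child names

    def visit(self, callback):
        """Call callback(name, is_terminal) for every node, depth first."""
        stack = [self.root]
        while stack:
            name = stack.pop()
            kids = self.children.get(name, [])
            callback(name, not kids)
            stack.extend(reversed(kids))


class WrappedTree(HostTree):
    """Adapter: satisfies the host interface by delegating to the external object."""

    def __init__(self, external):
        self._external = external

    def get_leaf_names(self):
        leaves = []

        def collect(name, is_terminal):
            if is_terminal:
                leaves.append(name)

        self._external.visit(collect)
        return leaves

    def number_nodes(self):
        names = []
        self._external.visit(lambda name, is_terminal: names.append(name))
        return len(names)


# Host-side code only ever sees the HostTree interface.
external = ExternalTree("root", {"root": ["A", "inner"], "inner": ["B", "C"]})
wrapped = WrappedTree(external)
print(wrapped.get_leaf_names(), wrapped.number_nodes())
```

In this sketch the external class is never modified: only the adapter knows about both sides, which mirrors the goal of letting Bio::Phylo and NeXML evolve independently of BioPerl.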

June 6th - June 12th

NeXML to BioPerl Objects

  • Continue writing the module, focusing on the methods of the BioPerl object interfaces that are needed for reading a NeXML file. As in the previous step, implementing the BioPerl interfaces allows Bio::Phylo objects, which are external to BioPerl, to be used as if they were native BioPerl objects.

June 13th - June 19th

Create Wrapper

  • Refactor the code from weeks 1, 2, and 3 into a stable wrapper in Bio::Seq::nexml, Bio::Align::nexml, and Bio::Tree::nexml

June 27th - July 3rd

Test Data Sets

  • Generate test data sets that will be used for both benchmarking and test use cases

July 4th - July 10th

Benchmark

  • Use BioPerl to query large data sets from GenBank

Validation

  • Use the online NeXML validator at http://www.nexml.org to check the NeXML produced by the new modules

Potential Cases

Start Midterm evaluation

July 11th - July 17th

Use Cases

  • Generate real-world examples based on the test data sets, with detailed documentation that walks the reader step by step through the new wrapper and the functionality of Bio::Phylo from a BioPerl perspective

July 18th - July 24th

Code Optimization

  • Optimize code based on results from benchmark

July 25th - July 31st

Incorporate new module into BioPerl

  • BioPerl-dev – the repository for experimental packages
  • BioPerl-live – the repository for modules slated for core releases (the module will go here only if the BioPerl core developers judge that it works well and is useful enough to be packaged as part of the core)

August 1st - August 7th

Debugging and Testing

August 8th - August 10th

Finish documentation and wrap up

August 17th - August 24th

Begin and finish final evaluations


Deliverables

  • Native NeXML integration into BioPerl
  • Full Bio::Phylo functionality from within BioPerl
  • Example files and test cases to facilitate adoption of NeXML