PhyloSoC:Mapping the Bio++ Phylogenetics toolkit to R/BioConductor and BioJAVA using BioLib

Author

Adam A. Smith, Computer Sciences, University of Wisconsin, Madison

Abstract

Bio++ is a C++ library of phylogenetics algorithms. Unfortunately, few bioinformaticians work in C++. The plan is therefore to translate the library into several other languages, including Java and R. (Perl, Python, and Ruby translations already exist.) Doing so should make the library accessible to the majority of active bioinformaticians.

Project Plan

Community Bonding Period:

  • Download, read, and thoroughly understand the original C++ code base.
  • Look at other translations that have been made.
  • Review the algorithms being implemented.
  • Ask community for links to other references that might be useful.
  • Set up wiki page for project.

May 20 - May 24:

  • Familiarize self with tools, especially SWIG.
  • Practice using SWIG for toy examples.
  • Look into automated testing frameworks, especially unit-testing frameworks (e.g. xUnit/JUnit).
  • Decide whether to use an automated framework or to test manually.
  • If automated, start creating testing code.

May 25 - June 7 (2 Weeks):

  • Translate and test Utils library.
    • Add proper code to the C++ library to enable SWIG translation, if not already done.
    • Translate library into Java using SWIG.
    • Unit test Java translation. Compare every function side by side with the original C++ library and require exactly the same results, down to significant digits and rounding errors.
    • Translate library into R.
    • Unit test R translation, as above.
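
One way to make "exactly the same" concrete for floating-point results is to compare bit patterns instead of using a tolerance. The following is only a minimal sketch in Python, assuming the outputs of the C++ library and of a translation have already been collected as lists of doubles; the helper names are hypothetical and not part of Bio++ or SWIG.

```python
import struct


def double_bits(x: float) -> int:
    """Return the IEEE 754 binary64 bit pattern of x as an unsigned integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def bitwise_equal(a: float, b: float) -> bool:
    """True only if a and b have identical bit patterns.

    This is stricter than ==: 0.0 and -0.0 differ, and two NaNs are
    equal only when their bit patterns match.
    """
    return double_bits(a) == double_bits(b)


def first_mismatch(reference, wrapped):
    """Return the index of the first differing value, or None if all match.

    `reference` holds values from the original C++ library and `wrapped`
    holds values from the translation (both names are assumptions).
    A length difference is reported at the end of the shorter sequence.
    """
    for i, (r, w) in enumerate(zip(reference, wrapped)):
        if not bitwise_equal(r, w):
            return i
    if len(reference) != len(wrapped):
        return min(len(reference), len(wrapped))
    return None
```

A check like this would flag a translation that returns 0.3 where the C++ code returns 0.1 + 0.2, which a tolerance-based comparison would let through.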

June 8 - June 14:

  • Translate and test NumCalc.
    • Enable SWIG translation.
    • Translate into Java.
    • Unit test Java.
    • Translate into R.
    • Unit test R.

June 15 - June 21:

  • Translate and test SeqLib.
    • Enable SWIG translation.
    • Translate into Java.
    • Unit test Java.
    • Translate into R.
    • Unit test R.

June 22 - June 26:

  • Translate and test PhylLib.
    • Enable SWIG translation.
    • Translate into Java.
    • Unit test Java.
    • Translate into R.
    • Unit test R.

June 27 - July 3:

Break: Attending ISMB conference.

July 4 - July 8:

  • Translate and test PopGenLib.
    • Enable SWIG translation.
    • Translate into Java.
    • Unit test Java.
    • Translate into R.
    • Unit test R.

July 9 - July 13:

  • Debug as necessary.
  • Submit to a wide audience for feedback and bug-testing on other computers. (This will help ensure that there aren't any computer-specific bugs of which I am unaware. It will also help make sure that other people can install the new libraries without problems.)

July 14 - July 24:

Break: Family vacation.

July 25 - August 9 (2 Weeks):

  • Write a tool (probably in Python) that parses SWIG's XML documentation output and emits it as Java Javadoc, Perl POD, Ruby RDoc, R Rd, and possibly Python pydoc.
  • Review and manually fix Java documentation.
  • Review and manually fix R documentation.
  • Review and manually fix Perl documentation.
  • Review and manually fix Ruby documentation.
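
As a rough illustration of the documentation tool, the sketch below reads a small XML fragment and emits a Javadoc comment stub. It is only a sketch under stated assumptions: the sample XML is hand-written and simplified (nested `attributelist`/`attribute` elements under `cdecl` and `parm` nodes), so the layout should be checked against real SWIG output before relying on it, and the function names and the `distance` declaration are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified sample; an assumption about the XML layout,
# not captured output from SWIG.
SAMPLE_XML = """\
<top>
  <cdecl>
    <attributelist>
      <attribute name="name" value="distance"/>
      <attribute name="kind" value="function"/>
      <attribute name="type" value="double"/>
      <parmlist>
        <parm><attributelist>
          <attribute name="name" value="seqA"/>
          <attribute name="type" value="int"/>
        </attributelist></parm>
        <parm><attributelist>
          <attribute name="name" value="seqB"/>
          <attribute name="type" value="int"/>
        </attributelist></parm>
      </parmlist>
    </attributelist>
  </cdecl>
</top>
"""


def attrs(node):
    """Map name -> value for the attributes directly under node's attributelist."""
    alist = node.find("attributelist")
    if alist is None:
        return {}
    return {a.get("name"): a.get("value") for a in alist.findall("attribute")}


def functions(root):
    """Yield (name, return_type, [(param_name, param_type), ...]) per function."""
    for cdecl in root.iter("cdecl"):
        info = attrs(cdecl)
        if info.get("kind") != "function":
            continue  # skip variables and other declarations
        params = [attrs(p) for p in cdecl.iter("parm")]
        yield (
            info["name"],
            info.get("type", "void"),
            [(p.get("name"), p.get("type")) for p in params],
        )


def to_javadoc(name, return_type, params):
    """Render a Javadoc comment stub for one function."""
    lines = ["/**", f" * Wrapper for the C++ function {name}."]
    for pname, ptype in params:
        lines.append(f" * @param {pname} C++ type {ptype}")
    if return_type != "void":
        lines.append(f" * @return C++ type {return_type}")
    lines.append(" */")
    return "\n".join(lines)


if __name__ == "__main__":
    for func in functions(ET.fromstring(SAMPLE_XML)):
        print(to_javadoc(*func))
```

Emitters for POD, RDoc, and Rd could take the same `(name, return_type, params)` tuples, so that only the rendering step differs per language. The generated stubs would still need the manual review listed above.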

August 10 - August 17:

  • Wrap up final Google requirements.
  • Continue debugging, if necessary.
  • Complete the integration with the existing code base.
  • Fix any last-minute problems that arise.

Beyond:

  • Ensure that the project is running smoothly. Fix any bugs that pop up.