PhyloSoC:Matrix display of phenotype annotations using ontologies in Phenote

From Phyloinformatics
Revision as of 10:28, 7 July 2008 by Mwallinga (talk) (Project Overview and Timeline)

Project Description

Phenote is an application that facilitates the annotation of biological phenotypes using ontologies. While initially developed for model organism mutants, the Phenoscape project has extended Phenote to efficiently annotate evolutionary phenotypes. Phenotype annotations are currently edited within Phenote in a list interface, allowing new phenotypic states to be annotated without pre-defining characters. However, a matrix view, showing annotated values arranged by taxon and character, would be familiar to evolutionary biologists and would help them evaluate taxonomic annotation coverage and phylogenetic patterns.

This project proposes the creation of this matrix view in Phenote. The interface would be implemented using Java Swing, most likely through the development of reusable customized component classes. Model classes will implement algorithms for populating this view by consolidating phenotype annotations into characters using the ontology's hierarchical structure. This will be accomplished by finding values that descend from the same attribute within the PATO phenotype ontology, as is already done in Phenote's prototype NEXUS exporter.
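
The consolidation step described above can be sketched as follows. This is a minimal illustration rather than Phenote's actual code: the class name CharacterGrouper, the parent map, and the set of terms flagged as attributes are all assumptions made for the example, and the term names are toy labels rather than real PATO identifiers. Each annotated value is walked up its is_a chain until an attribute is reached, and values that share an entity and an attribute are collected into one character.

```java
import java.util.*;

/** Hypothetical helper: groups annotated values into characters by their attribute ancestor. */
class CharacterGrouper {
    private final Map<String, String> parentOf;   // toy ontology: term -> is_a parent
    private final Set<String> attributes;         // terms treated as attributes (assumption)

    CharacterGrouper(Map<String, String> parentOf, Set<String> attributes) {
        this.parentOf = parentOf;
        this.attributes = attributes;
    }

    /** Walks up is_a links until a term flagged as an attribute is found; null if none. */
    String attributeFor(String valueTerm) {
        Set<String> seen = new HashSet<>();        // guards against cycles in bad input
        String current = valueTerm;
        while (current != null && seen.add(current)) {
            if (attributes.contains(current)) return current;
            current = parentOf.get(current);
        }
        return null;
    }

    /** Groups {entity, value} pairs into characters keyed by "entity / attribute". */
    Map<String, Set<String>> group(List<String[]> annotations) {
        Map<String, Set<String>> characters = new TreeMap<>();
        for (String[] a : annotations) {
            String entity = a[0], value = a[1];
            String attribute = attributeFor(value);
            String key = entity + " / " + (attribute == null ? "ungrouped" : attribute);
            characters.computeIfAbsent(key, k -> new TreeSet<>()).add(value);
        }
        return characters;
    }
}
```

With a toy hierarchy in which "red" and "blue" descend from "color" and "round" descends from "shape", three annotations on the entity "fin" collapse into two characters: "fin / color" with two states and "fin / shape" with one.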

Several additional features will extend the matrix view. First, data export will be added so that Phenote's annotations can be used in other matrix-oriented evolutionary software. The matrix view will offer flexible sorting and grouping of results, such as the ability to narrow the display to subsets of particular ontologies. The view will be further customizable via user-definable options for how annotations are grouped into characters. Finally, users will be able to define characters by hand within the Swing interface.

Project Overview and Timeline

1. May 19 – May 24: Initial setup, environment configuration, familiarization with existing code

2. May 27 – June 7: The initial focus will be on the program's model classes. The code will have to access existing Phenote data and ontologies, and refresh the data as changes are made to the rest of the application. Appropriate data structures will be developed for organizing and representing the data in matrix form.

3. June 9 – June 13: Development of an interface component to hold the matrix view. This may involve creating a custom component class that uses multiple JTables.

4. June 16 – June 21: Integration and testing of the matrix view component with the full Phenote code base. Attend the Evolutionary Biology and Ontologies Workshop in the Twin Cities.

5. June 23 – June 28: This week is reserved to respond to feedback received at the workshop, and to make any revisions to the interface or model code based on that feedback. If very little needs to be done, step 6 will start early.

6. June 30 – July 5: Add an exporting feature to allow compatibility with other matrix-oriented software packages.

7. July 7 – July 19: Create custom viewing options, such as narrowing the display to subsets of ontologies and grouping annotations into characters.

8. July 21 – August 2: Add the ability to allow users to define characters by hand.

9. August 4 – August 9: User testing and feedback gathering, and subsequent refactoring, correcting, and/or redesigning of my code, similar to step 5.

10. August 11 – August 18: This week will serve as a buffer period for catching up and resolving any outstanding errors.
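
As an illustration of the export planned in step 6, the sketch below writes a taxon-by-character array as a NEXUS DATA block. It is an assumption-laden example, not Phenote's prototype exporter: the class NexusSketch and its encoding rule (the first distinct state seen in a character becomes 0, the next 1, and so on, with null written as the missing symbol) are invented here for illustration, and the sketch supports at most ten states per character.

```java
import java.util.*;

/** Hypothetical exporter: encodes named states as digits and emits a NEXUS DATA block. */
class NexusSketch {
    static String toNexus(String[] taxa, String[][] states) {
        int nchar = states.length == 0 ? 0 : states[0].length;
        // One symbol table per character, filled in first-seen order.
        List<List<String>> symbols = new ArrayList<>();
        for (int c = 0; c < nchar; c++) symbols.add(new ArrayList<>());

        StringBuilder matrix = new StringBuilder();
        for (int t = 0; t < taxa.length; t++) {
            if (states[t].length != nchar)
                throw new IllegalArgumentException("ragged row for taxon " + taxa[t]);
            // Underscores stand in for spaces so each taxon name stays a single token.
            matrix.append("    ").append(taxa[t].trim().replaceAll("\\s+", "_")).append("  ");
            for (int c = 0; c < nchar; c++) {
                String state = states[t][c];
                if (state == null) { matrix.append('?'); continue; }
                List<String> seen = symbols.get(c);
                if (!seen.contains(state)) seen.add(state);
                int code = seen.indexOf(state);
                if (code > 9)
                    throw new IllegalArgumentException("more than 10 states in character " + c);
                matrix.append(code);
            }
            matrix.append('\n');
        }

        StringBuilder sb = new StringBuilder();
        sb.append("#NEXUS\n");
        sb.append("BEGIN DATA;\n");
        sb.append("  DIMENSIONS NTAX=").append(taxa.length)
          .append(" NCHAR=").append(nchar).append(";\n");
        sb.append("  FORMAT DATATYPE=STANDARD MISSING=? SYMBOLS=\"0123456789\";\n");
        sb.append("  MATRIX\n");
        sb.append(matrix);
        sb.append("  ;\nEND;\n");
        return sb.toString();
    }
}
```

A real exporter would also need to record which ontology term each symbol stands for, for example in character and state labels, so that the encoded matrix remains interpretable.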

Detailed Project Plan

To-Do List for Phase 1 (May 19 – May 24)

  • Installed and configured software packages (Java SDK, Eclipse, Subversion)
  • Successfully checked out the Phenote code base and built the project locally
  • Joined mailing lists for Phenote development, bug tracking, and commits
  • Created a home page for the project on the Phyloinformatics wiki, including project timeline and detailed goals
  • Familiarized myself with the existing Phenote classes suggested by my mentor, Jim

To-Do List for Phase 2 (May 27 – June 7)

  • Understand the existing code for working with PATO, located in phenote.datamodel.OboUtil
  • Design the additional methods or classes, if any, that must be added
  • Code the additional methods or classes for parsing and using PATO
  • Develop algorithms for using PATO in conjunction with curated data to discover common descendants across species
  • Develop algorithms for accessing the character data and representing it in a matrix format
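
One possible data structure for the last item above is a sparse taxon-by-character map exposed through Swing's TableModel interface, so that it can later back a JTable directly. The sketch below is only an assumption about how such a model might look; CharacterMatrixModel and addState are hypothetical names, and multiple states in one cell are simply joined with a slash.

```java
import java.util.*;
import javax.swing.table.AbstractTableModel;

/** Hypothetical sparse taxon-by-character matrix exposed as a Swing TableModel. */
class CharacterMatrixModel extends AbstractTableModel {
    private final List<String> taxa = new ArrayList<>();        // row order
    private final List<String> characters = new ArrayList<>();  // column order
    private final Map<String, Map<String, Set<String>>> cells = new HashMap<>();

    /** Records one state; new taxa and characters are appended in first-seen order. */
    void addState(String taxon, String character, String state) {
        if (!taxa.contains(taxon)) taxa.add(taxon);
        if (!characters.contains(character)) characters.add(character);
        cells.computeIfAbsent(taxon, t -> new HashMap<>())
             .computeIfAbsent(character, c -> new TreeSet<>())
             .add(state);
        fireTableStructureChanged(); // rows or columns may have been added
    }

    @Override public int getRowCount() { return taxa.size(); }

    // Column 0 holds the taxon name; the remaining columns are characters.
    @Override public int getColumnCount() { return characters.size() + 1; }

    @Override public String getColumnName(int col) {
        return col == 0 ? "Taxon" : characters.get(col - 1);
    }

    @Override public Object getValueAt(int row, int col) {
        String taxon = taxa.get(row);
        if (col == 0) return taxon;
        Set<String> states = cells.get(taxon).get(characters.get(col - 1));
        // "?" marks a taxon with no annotation for this character.
        return states == null ? "?" : String.join("/", states);
    }
}
```

Keeping the cells in nested maps means that unannotated taxon and character combinations cost no storage, which suits a matrix that is expected to be mostly empty while curation is in progress.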

To-Do List for Phase 3 (June 9 – June 13)

  • Determine which existing classes, if any, should be used or extended to develop the matrix interface
  • Design new classes or components to complete the matrix interface
  • Write code to populate the matrix using the data structures created in the previous step
  • Add the ability to access the matrix interface from within Phenote's existing interface
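
A component built from multiple JTables, as considered for this phase, could pair a body table with a second table used as a fixed row header inside one JScrollPane. The following is a rough sketch under that assumption; MatrixPaneSketch and its build method are hypothetical names, and the data is passed in as plain arrays only to keep the example short.

```java
import javax.swing.JScrollPane;
import javax.swing.JTable;
import javax.swing.table.DefaultTableModel;

/** Hypothetical builder for a matrix view made of two JTables in one scroll pane. */
class MatrixPaneSketch {
    static JScrollPane build(String[] taxa, String[] characters, String[][] states) {
        // Body table: one column per character, one row per taxon.
        JTable body = new JTable(new DefaultTableModel(states, characters));
        body.setAutoResizeMode(JTable.AUTO_RESIZE_OFF); // scroll sideways instead of squeezing

        // Row-header table: a single column holding the taxon names.
        String[][] taxonColumn = new String[taxa.length][1];
        for (int i = 0; i < taxa.length; i++) taxonColumn[i][0] = taxa[i];
        JTable rowHeader = new JTable(new DefaultTableModel(taxonColumn, new String[] {"Taxon"}));

        // Keep the two tables in step: same row height and a shared row selection.
        rowHeader.setRowHeight(body.getRowHeight());
        rowHeader.setSelectionModel(body.getSelectionModel());
        rowHeader.setPreferredScrollableViewportSize(rowHeader.getPreferredSize());

        JScrollPane pane = new JScrollPane(body);
        pane.setRowHeaderView(rowHeader); // stays put while the body scrolls horizontally
        pane.setCorner(JScrollPane.UPPER_LEFT_CORNER, rowHeader.getTableHeader());
        return pane;
    }
}
```

Because the taxon column lives in the scroll pane's row header, it remains visible while the user scrolls across a large number of characters.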

To-Do List for Phase 4 (June 16 – June 27)

  • Implement event listening to keep the matrix up-to-date based on changes to the data elsewhere in Phenote
  • Integrate and test my component with the rest of Phenote, including proper menu and GUI placement
  • Attend Evolutionary Biology and Ontologies Workshop in the Twin Cities
  • Refactor code for simplicity or generalizability
  • Add more thorough documentation, including JavaDoc comments
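
The event-listening item in the list above could take roughly the following shape. This is a sketch built on invented names: AnnotationListener and AnnotationStore are hypothetical stand-ins, and Phenote's real change-notification mechanism may differ. The only real API relied on is AbstractTableModel.fireTableDataChanged, which notifies any attached JTable that the model's rows have changed.

```java
import java.util.*;
import javax.swing.table.AbstractTableModel;

/** Hypothetical callback; Phenote's real change-notification API may look different. */
interface AnnotationListener {
    void annotationsChanged();
}

/** Hypothetical stand-in for the annotation list edited elsewhere in the application. */
class AnnotationStore {
    private final List<String[]> annotations = new ArrayList<>(); // {taxon, character, state}
    private final List<AnnotationListener> listeners = new ArrayList<>();

    void addListener(AnnotationListener l) { listeners.add(l); }

    void add(String taxon, String character, String state) {
        annotations.add(new String[] {taxon, character, state});
        for (AnnotationListener l : listeners) l.annotationsChanged();
    }

    List<String[]> snapshot() { return new ArrayList<>(annotations); }
}

/** Read-only table model that re-reads the store whenever it reports a change. */
class LiveAnnotationModel extends AbstractTableModel implements AnnotationListener {
    private final AnnotationStore store;
    private List<String[]> rows;

    LiveAnnotationModel(AnnotationStore store) {
        this.store = store;
        this.rows = store.snapshot();
        store.addListener(this);
    }

    @Override public void annotationsChanged() {
        rows = store.snapshot();
        fireTableDataChanged(); // prompts attached tables to refresh their rows
    }

    @Override public int getRowCount() { return rows.size(); }
    @Override public int getColumnCount() { return 3; }
    @Override public Object getValueAt(int r, int c) { return rows.get(r)[c]; }
}
```

In a running Swing application, changes that originate off the event dispatch thread should be handed to it (for example with SwingUtilities.invokeLater) before the model fires its event.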