PhyloSoC:Matrix display of phenotype annotations using ontologies in Phenote

From Phyloinformatics
Revision as of 22:23, 19 September 2008 by Mwallinga (talk) (Matrix Viewer Demonstration)

Project Description

Phenote is an application which facilitates the annotation of biological phenotypes using ontologies. While initially developed for model organism mutants, the Phenoscape project has extended Phenote to efficiently annotate evolutionary phenotypes. Phenotype annotations are currently edited within Phenote in a list interface, allowing new phenotypic states to be annotated without pre-defining characters. However, a way to view annotations in a matrix form, showing character values grouped by taxon and character, would be familiar to evolutionary biologists and aid in evaluation of taxonomic annotation coverage and phylogenetic patterns.

This project proposes the creation of this matrix view in Phenote. The interface would be implemented using Java Swing, most likely through the development of reusable customized component classes. Model classes will implement algorithms for populating this view by consolidating phenotype annotations into characters using the ontology hierarchical structure. This will be accomplished by finding values which descend from the same attribute within the PATO phenotype ontology, as implemented in Phenote's prototype NEXUS exporter.
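The consolidation step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `CharacterGrouping` and `Annotation` names are hypothetical, the child-to-parent map is a toy stand-in for the PATO hierarchy, and none of it reflects the actual API of Phenote or its NEXUS exporter.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative only: groups annotations into characters by shared attribute ancestor. */
final class CharacterGrouping {

    /** A single annotation (hypothetical type): taxon, entity, and quality term names. */
    static final class Annotation {
        final String taxon;
        final String entity;
        final String quality;

        Annotation(String taxon, String entity, String quality) {
            this.taxon = taxon;
            this.entity = entity;
            this.quality = quality;
        }
    }

    /**
     * Walks up a child -> parent map (assumed stand-in for the ontology hierarchy)
     * until a term in the attribute set is reached. Returns null if none is found.
     */
    static String findAttribute(String term, Map<String, String> parentOf, Set<String> attributes) {
        String current = term;
        // Bound the walk by the map size so a cyclic toy hierarchy cannot loop forever.
        for (int i = 0; current != null && i <= parentOf.size(); i++) {
            if (attributes.contains(current)) {
                return current;
            }
            current = parentOf.get(current);
        }
        return null;
    }

    /** Groups annotations under "entity attribute" character labels, keeping insertion order. */
    static Map<String, List<Annotation>> groupIntoCharacters(
            List<Annotation> annotations, Map<String, String> parentOf, Set<String> attributes) {
        Map<String, List<Annotation>> characters = new LinkedHashMap<>();
        for (Annotation a : annotations) {
            String attribute = findAttribute(a.quality, parentOf, attributes);
            // A value with no known attribute falls back to its own quality name.
            String label = a.entity + " " + (attribute != null ? attribute : a.quality);
            characters.computeIfAbsent(label, k -> new ArrayList<>()).add(a);
        }
        return characters;
    }
}
```

With a toy hierarchy in which "round" and "oval" both descend from "shape", annotations of a fin as round in one taxon and oval in another would end up in a single "fin shape" character, which then becomes one matrix column.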

Several additional features will extend the matrix view. First, data export will be added to make Phenote compatible with other matrix-oriented evolutionary software. The matrix view will support flexible sorting and grouping of results, such as narrowing the display to subsets of particular ontologies. The view will be further customizable through user-definable options for how annotations are grouped into characters. Finally, users will be able to define characters by hand within the Swing interface.

Final Notes: Midway through the summer, the decision was made to rewrite the basic functionality to improve the overall design and promote generality so that other groups would be able to use the matrix viewer for data other than phenotype annotations. The original project description (above) has been left intact, but the timeline (below) has been changed to reflect the actual development cycle.

Project Timeline

1. May 19 – May 24: Initial setup, environment configuration

2. May 27 – June 1: Familiarization with existing code and beginning work on algorithms.

3. June 2 – June 8: Develop the program's model classes and create appropriate data structures for organizing and representing the data in matrix form.

4. June 9 – June 29: Development of an interface component to hold the matrix view. Attend the Evolutionary Biology and Ontologies Workshop in the Twin Cities.

5. June 30 – July 12: Implement event handling and integration to receive updated information from other parts of Phenote.

6. July 14 – August 3: Refactor and rewrite large portions of the model classes to make them more generalizable. Make the necessary edits to the controller and GUI classes to use the new model classes.

7. August 4 – August 18: With the new classes in place, test, debug, and fix errors. Add additional features and formatting options as time allows.

Detailed Project History

Phase 1 (May 19 – May 24)

  • Installed and configured software packages (Java SDK, Eclipse, Subversion)
  • Successfully checked out the Phenote code base and built the project locally
  • Joined mailing lists for Phenote development, bug tracking, and commits
  • Created a home page for the project on the Phyloinformatics wiki, including project timeline and detailed goals
  • Familiarized myself with the existing Phenote classes suggested by my mentor

Phase 2 (May 27 – June 1)

  • Continued studying the existing code for working with PATO, located in phenote.datamodel.OboUtil
  • Designed and coded the additional methods and classes that were needed
  • Developed algorithms for using PATO in conjunction with curated data to discover common descendants across species

Phase 3 (June 2 – June 8)

  • Revised my algorithms based on feedback from Jim
  • Wrote code for the additional methods and classes needed to create the matrix
  • Updated my project page

Phase 4 (June 9 – June 29)

  • Determined which existing classes, if any, should be used or extended to develop the matrix interface
  • Designed new classes or components to complete the matrix interface
  • Added the ability to access the matrix interface from within Phenote's existing interface
  • Wrote code to populate the matrix GUI component
  • Wrote a custom renderer class to define the display of my matrix data
  • Added JavaDoc documentation for my code

Phase 5 (June 30 – July 12)

  • Finished writing JavaDoc comments for my existing code
  • Fixed event handling errors so that the matrix data can be kept up-to-date with the rest of Phenote
  • Fixed several other bugs including some null pointer issues

Phase 6 (July 14 – August 3)

  • Substantially rewrote the data model to be more generalizable
  • Revised the controller and GUI component classes to use my new model
  • Rewrote model classes for matrix rows and columns to use hash codes and sets
  • Fixed miscellaneous bugs, including a serious display issue with the JTable
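
As an illustration of what row and column classes built on hash codes and sets might look like, here is a minimal sketch. The `SparseMatrix` and `Key` names are hypothetical and do not come from the Phenote code base. The sketch only shows the general technique: value-based `equals`/`hashCode` keys stored in sets, so that repeated labels collapse into a single row or column.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

/** Illustrative only: a generic matrix whose rows and columns are sets of value-based keys. */
final class SparseMatrix {

    /** Key for a row or a column; two keys with equal labels denote the same row or column. */
    static final class Key {
        final String label;

        Key(String label) {
            this.label = Objects.requireNonNull(label);
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).label.equals(label);
        }

        @Override
        public int hashCode() {
            return label.hashCode();
        }
    }

    // LinkedHashSet keeps first-seen order while rejecting duplicate keys.
    private final Set<Key> rows = new LinkedHashSet<>();
    private final Set<Key> columns = new LinkedHashSet<>();
    // A two-element list of keys has value-based equals/hashCode, so it can index a cell.
    private final Map<List<Key>, String> cells = new HashMap<>();

    /** Stores a value, creating the row and column on first use and overwriting any old value. */
    void put(String rowLabel, String columnLabel, String value) {
        Key row = new Key(rowLabel);
        Key column = new Key(columnLabel);
        rows.add(row);       // no effect if an equal key is already present
        columns.add(column);
        cells.put(Arrays.asList(row, column), value);
    }

    /** Returns the stored value, or null for an empty cell. */
    String get(String rowLabel, String columnLabel) {
        return cells.get(Arrays.asList(new Key(rowLabel), new Key(columnLabel)));
    }

    int rowCount() {
        return rows.size();
    }

    int columnCount() {
        return columns.size();
    }
}
```

Because nothing in this sketch refers to taxa or phenotypes, the same structure could hold any data that can be labeled by row and column, which is in the spirit of the generalization goal for this phase.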

Phase 7 (August 4 – August 18)

  • Improved the matrix table display (added headings, created more descriptive cell contents, etc.)
  • Removed debugging code (extraneous println statements, comments, unused files, etc.)
  • Added additional JavaDoc documentation and other comments

Matrix Viewer Demonstration

To use the matrix viewer in Phenote, the user must first add information to the main annotation table. A small, simple example is shown below:

Annotation table.JPG

Once the data is prepared, access the matrix viewer from its location under the View menu:


Here is the matrix that corresponds to the data. It does not use traditional numeric entries; instead, it combines the annotation entries with the underlying ontology entries to form a complete description of each character. For example, if a character requires a second entity, the matrix viewer automatically appends the value of the Additional Entity column in the annotation table to the standard Quality entry. Likewise, if a character requires count data, the matrix viewer replaces the Quality value of "count" with the actual numeric data from the Count column in the annotation table. The column headings in the matrix are also contextual, combining the entity and quality annotations to create meaningful descriptions.
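
The two cell-content rules just described can be expressed in a few lines. The following is a minimal sketch, not the viewer's actual code: the `CellText` class and its `describe` method are hypothetical, and joining the quality and the additional entity with a single space is an assumption about the display format.

```java
/** Illustrative only: builds the text shown in one matrix cell. */
final class CellText {

    private CellText() {
    }

    /**
     * @param quality          value of the Quality column
     * @param additionalEntity value of the Additional Entity column, or null/empty if unused
     * @param count            value of the Count column, or null/empty if unused
     */
    static String describe(String quality, String additionalEntity, String count) {
        // Count data: show the number itself instead of the literal quality name "count".
        if ("count".equals(quality) && count != null && !count.isEmpty()) {
            return count;
        }
        // Second entity: append it to the quality (the space separator is an assumption).
        if (additionalEntity != null && !additionalEntity.isEmpty()) {
            return quality + " " + additionalEntity;
        }
        // Plain quality: shown unchanged.
        return quality;
    }
}
```

For example, under these assumptions a Quality of "count" with a Count of "3" is displayed as "3", while a Quality of "present" with no additional entity is displayed unchanged.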


Notice that in the second column of the matrix, only one entry has a value of "present". For the sake of example, let's change that value to "absent" in the main annotation table:

Make change.JPG

Switching back to the matrix viewer shows that the matrix has been updated automatically with the new value of "absent"; the user does not need to close and reopen the matrix viewer to see this change:
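
One common way to get this kind of automatic refresh in Swing is to have the table model fire a change event whenever its data is modified. The sketch below shows that general pattern only. `LiveMatrixModel` and `annotationChanged` are hypothetical names, and how Phenote actually propagates edits from the annotation table to the matrix viewer is not shown here.

```java
import javax.swing.table.AbstractTableModel;

/** Illustrative only: a table model that notifies its listeners when a cell changes. */
final class LiveMatrixModel extends AbstractTableModel {

    private final String[] columns;
    private final Object[][] cells;

    LiveMatrixModel(String[] columns, Object[][] cells) {
        this.columns = columns;
        this.cells = cells;
    }

    @Override
    public int getRowCount() {
        return cells.length;
    }

    @Override
    public int getColumnCount() {
        return columns.length;
    }

    @Override
    public String getColumnName(int column) {
        return columns[column];
    }

    @Override
    public Object getValueAt(int row, int column) {
        return cells[row][column];
    }

    /** Hypothetical hook, called when the corresponding annotation is edited elsewhere. */
    void annotationChanged(int row, int column, Object newValue) {
        cells[row][column] = newValue;
        // Notifies every registered TableModelListener that this cell changed.
        fireTableCellUpdated(row, column);
    }
}
```

A `JTable` constructed with `new JTable(model)` registers itself as a listener on the model, so a call such as `model.annotationChanged(0, 1, "absent")` would cause the visible cell to be redrawn without reopening the window.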

Updated matrix.JPG

Future work on the matrix viewer will build on this basic feature set, adding advanced options for sorting, grouping, and filtering the matrix and giving the user customizable display settings.