PhyloSoC: PhyloGeoRef Java library for mapping phylogenetic trees in KML

Abstract

The PhyloGeoRef library represents phylogenetic data in a geospatial format. In biology, phylogenetics is the study of evolutionary relatedness among groups of organisms (e.g. species, populations), which is inferred from molecular sequencing data and morphological data matrices. The current version of PhyloGeoRef can construct KML files that display a phylogenetic tree. A phylogenetic tree, as the name suggests, is a tree diagram that shows the relationships among biological species based on similarities and differences in their genetic characteristics.


Author

Apurv Verma

Main Deliverables

1) Enhanced data visualization support in PhyloGeoRef.

2) Encapsulating additional statistical information about the phylogenetic tree in the KML.

3) Support for NeXML.


Weekly Plan

Week 1: 24th May - 30th May

1) Build the NeXMLEngine class, which will use the org.nexml library to parse a NeXML file and create a Phylogeny object from it.

2) Understand how metadata is embedded in a NeXML file and how it can be extracted.


Week 2: 31st May - 6th June

1) Was able to parse a simple NeXML file and extract information from it.

2) Read the org.forester.phylogeny packages which are used for construction of Phylogeny objects.

3) Postponed the implementation of Roderic Page's technique to a later week, when I will be working on the KML enhancements.


Week 3: 7th June - 13th June

1) Converted a simple NeXML file to a Phylogeny object.

2) Posted one blog entry.

3) Metadata has still not been handled for NeXML files.

4) Pushed code to GitHub.


Week 4: 14th June - 20th June

1) Write classes for writing more information into the KML. (not finished)

2) Test the code on the mammals CSV file sent by David. (done)

3) Handle metadata in CSV and nwk files. (done)

4) Figure out a mechanism for validating the CSV file: the program should automatically detect color conflicts in clades, choose the correct color, and continue running. (not done)


Week 5: 21st June - 27th June

1) Prepare a UniversalTreeReader. (done)

2) Prepare a UniversalMetadataReader (done)

3) Combine the two readers above into a GrandUnifiedReader, which takes any of the supported file types and produces the phylogeny. (done)

4) Prepare a schematic overview of the work, the link to which can be found on the blog. (done)


Week 6: 28th June - 4th July

1) Validating the phylogeny.

2) Propagating the assigned coordinate and color values up the tree. (Phylogenification)

3) Begin to write a KmlWriter.


Week 7: 5th July - 11th July

1) Finish the StaticKmlPainter class, which writes the KML level-wise. (done)

2) Add placemarks for Hypothetical Taxonomic Units. (done)

3) Create styled edges which get selected when the mouse moves over them. (done)

4) Add regions to the tip nodes to provide an LOD (level of detail) feature. (done)

5) Draw lines following the curvature of the Earth. (done)
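
Item 5 above does not say how the lines are made to follow the curvature of the Earth. One possible approach, sketched here purely as an assumption, is to interpolate intermediate points along the great circle between two nodes and write them as the coordinates of a KML LineString. The names GreatCircleEdge and coordinates are hypothetical and are not part of PhyloGeoRef.

```java
import java.util.Locale;

public class GreatCircleEdge {

    /** Converts latitude/longitude in degrees to a unit vector (x, y, z). */
    static double[] toVector(double latDeg, double lonDeg) {
        double lat = Math.toRadians(latDeg);
        double lon = Math.toRadians(lonDeg);
        return new double[] {
            Math.cos(lat) * Math.cos(lon),
            Math.cos(lat) * Math.sin(lon),
            Math.sin(lat)
        };
    }

    /**
     * Returns a KML coordinates string ("lon,lat,alt" tuples separated by spaces)
     * with segments + 1 points spaced evenly along the great circle between the
     * two end points. Antipodal end points are not handled by this sketch.
     */
    public static String coordinates(double lat1, double lon1,
                                     double lat2, double lon2,
                                     double altitudeMetres, int segments) {
        double[] a = toVector(lat1, lon1);
        double[] b = toVector(lat2, lon2);
        double dot = a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
        // Clamp before acos to guard against rounding slightly outside [-1, 1].
        double angle = Math.acos(Math.max(-1.0, Math.min(1.0, dot)));

        StringBuilder sb = new StringBuilder();
        for (int i = 0; i <= segments; i++) {
            double f = (double) i / segments;
            double wa;
            double wb;
            if (angle < 1e-12) {
                // End points coincide: fall back to linear weights.
                wa = 1.0 - f;
                wb = f;
            } else {
                // Spherical linear interpolation between the two unit vectors.
                wa = Math.sin((1.0 - f) * angle) / Math.sin(angle);
                wb = Math.sin(f * angle) / Math.sin(angle);
            }
            double x = wa * a[0] + wb * b[0];
            double y = wa * a[1] + wb * b[1];
            double z = wa * a[2] + wb * b[2];
            double lat = Math.toDegrees(Math.atan2(z, Math.sqrt(x * x + y * y)));
            double lon = Math.toDegrees(Math.atan2(y, x));
            if (i > 0) {
                sb.append(' ');
            }
            // KML expects longitude first, then latitude, then altitude.
            sb.append(String.format(Locale.ROOT, "%.6f,%.6f,%.1f", lon, lat, altitudeMetres));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String coords = coordinates(45, 0, 45, 90, 1000.0, 8);
        System.out.println("<LineString><altitudeMode>absolute</altitudeMode><coordinates>"
                + coords + "</coordinates></LineString>");
    }
}
```

For lines clamped to the ground, the KML tessellate element may be enough on its own; explicit interpolation like this is mainly useful when edges are drawn at an altitude.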

Week 8: 12th July - 18th July

1) Began to write the HTMLParlor class, which will prepare the HTML to be displayed with each node.

2) Read about animations in KML.

Week 9: 19th July - 25th July

1) Added an implementation of the corrected centroid algorithm.

2) Discovered KML features that were new to me, such as MultiGeometry.

3) Separated the implementations of the level-wise (done) and hierarchical (not done) KML designs.

4) Added a utility for compressing the KML file into a KMZ archive.
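
A KMZ file is a zip archive containing a KML file, conventionally named doc.kml at the root of the archive. The compression utility from item 4 could look roughly like the following minimal sketch, which uses only the standard java.util.zip classes. The class name KmzPacker and the method writeKmz are hypothetical and are not taken from the PhyloGeoRef code.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class KmzPacker {

    /** Writes the given KML text into a KMZ archive as a single entry named doc.kml. */
    public static void writeKmz(String kml, Path kmzFile) throws IOException {
        try (OutputStream out = Files.newOutputStream(kmzFile);
             ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry("doc.kml"));
            zip.write(kml.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
    }

    public static void main(String[] args) throws IOException {
        // A tiny placeholder document; a real tree KML would be much larger.
        String kml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<kml xmlns=\"http://www.opengis.net/kml/2.2\"><Document/></kml>";
        Path target = Files.createTempFile("tree", ".kmz");
        writeKmz(kml, target);
        System.out.println("Wrote " + target + " (" + Files.size(target) + " bytes)");
    }
}
```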

Week 10: 26th July - 1st August

1) Tested the centroid algorithm; it had many errors, which I fixed.

2) Added a weighted mean to it as well, so that the root is no longer placed equidistantly between the child nodes.

3) Added a weighted mean for the color that is propagated up the tree.
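
The notes above do not describe the centroid algorithm or the weights used, so the following is only a sketch of one common way to do it. For coordinates, each child position is converted to a 3D unit vector, the vectors are averaged with weights, and the result is converted back to latitude and longitude. This avoids the errors that plain averaging of longitudes produces near the antimeridian. For colors, the RGB channels are averaged with the same weights. The class WeightedMean, its methods, and the choice of weights are all hypothetical.

```java
public class WeightedMean {

    /**
     * Returns {latitude, longitude} in degrees of the weighted mean of the given
     * points on the sphere. If the weighted vectors cancel out exactly, the
     * result is not meaningful; this sketch does not detect that case.
     */
    public static double[] centroid(double[] latsDeg, double[] lonsDeg, double[] weights) {
        double x = 0.0;
        double y = 0.0;
        double z = 0.0;
        for (int i = 0; i < latsDeg.length; i++) {
            double lat = Math.toRadians(latsDeg[i]);
            double lon = Math.toRadians(lonsDeg[i]);
            x += weights[i] * Math.cos(lat) * Math.cos(lon);
            y += weights[i] * Math.cos(lat) * Math.sin(lon);
            z += weights[i] * Math.sin(lat);
        }
        // atan2 only depends on the direction of (x, y, z), so no normalisation is needed.
        double lat = Math.atan2(z, Math.sqrt(x * x + y * y));
        double lon = Math.atan2(y, x);
        return new double[] {Math.toDegrees(lat), Math.toDegrees(lon)};
    }

    /** Returns the weighted mean of the given {r, g, b} colors, rounded per channel. */
    public static int[] mixRgb(int[][] rgb, double[] weights) {
        double r = 0.0;
        double g = 0.0;
        double b = 0.0;
        double total = 0.0;
        for (int i = 0; i < rgb.length; i++) {
            r += weights[i] * rgb[i][0];
            g += weights[i] * rgb[i][1];
            b += weights[i] * rgb[i][2];
            total += weights[i];
        }
        return new int[] {
            (int) Math.round(r / total),
            (int) Math.round(g / total),
            (int) Math.round(b / total)
        };
    }

    public static void main(String[] args) {
        // Two children on the equator; the first carries three times the weight.
        double[] parent = centroid(new double[] {0, 0}, new double[] {0, 90}, new double[] {3, 1});
        int[] color = mixRgb(new int[][] {{255, 0, 0}, {0, 0, 255}}, new double[] {3, 1});
        System.out.println("parent lat=" + parent[0] + " lon=" + parent[1]);
        System.out.println("parent color=" + color[0] + "," + color[1] + "," + color[2]);
    }
}
```

With unequal weights the parent is pulled toward the heavier child, which matches the unequal placement described in item 2.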

Week 11: 1st August - 7th August

Week 12: 7th August - 13th August