PhyloSoC:Browser-based animations for phylogeography
Author and Relevant links
Author: Michael Landis
Mentors: Trevor Bedford, Andrew Rambaut
Project code: http://github.com/mlandis/phylowood
Web service: http://mlandis.github.com/phylowood
Abstract
The fields of phylogeography and biogeography both study the distribution of life over the planet to learn more about evolutionary, ecological, and geographical processes. The fossil record and statistical inference are used to reconstruct hypotheses about how ancestral species or populations were distributed over time. Animated graphics communicate such hypotheses clearly to both the expert and the layperson.
We propose to develop a lightweight Javascript and Processing.js package to generate interactive browser-based animations published under an open source license.
Project Goals
- Easy to Interpret
- Converts numerical output into intuitive animations
- Coloring and masking controls to distinguish specific lineages/clades
- Time-calibrated phylogeny and discrete state space reflect modeling assumptions
- Easy to Use
- Requires no software installation
- Intuitive controls (standard movie player controls, phylogenetic movie player controls)
- Establish file parsing for several popular software output formats (e.g. BEAST, Lagrange)
- (possible) Export utilities (e.g. screen capture, video export, Youtube export)
- Extensible
- Javascript published under GPL
- Use other open source libraries
- Scales for large numbers of lineages and states
- Design classes for future extension
Timeline
Week 1: April 23 - April 29 (Community Bonding)
Completed
- Set up GitHub page (https://github.com/mlandis/phylowood).
- Reviewed Javascript using Crockford's "Javascript: The Good Parts".
Week 2: April 30 - May 6
Completed
- Configured Wiki project page.
- Held Skype meeting with Trevor Bedford.
- Discussed performance concerns regarding D3.js vs. Processing.js.
- Discussed visualization concerns regarding Marker- vs. Area-based animations.
- Uploaded sample input to GitHub.
Week 3: May 7 - May 13
Completed
- Trevor and I investigated performance issues related to D3.js and Processing.js. Several sources indicate D3.js works well for small numbers of entities on a large canvas, while Processing.js works well for large numbers of entities on a small canvas. Trevor tested this formally: http://www.trevorbedford.com/archive/may_07_2012.html. Fortunately, much of the file parsing and animation choreography can be done independently of the visualization.
- Created test cases with jsPhyloSvg to determine what could be repurposed (Newick parsing and tree drawing look usable; I will probably have to create a Tree object from scratch, however).
Week 4: May 14 - May 20
Completed
- Constructed basic HTML page for testing purposes.
- Explored Google Chrome's Javascript Tools environment.
- Met with software developer Shawn Lewis for general advice on Javascript development.
- Resolved FileReader-Chrome incompatibilities for file input.
- Implemented toy classes to set Javascript coding style precedent.
Milestone: Javascript libraries and development environment are tested and approved.
Completed ahead of schedule:
- Established file format.
- Implemented file parsing to extract geographical and ancestral state reconstructions from file.
- Stored geography and state data to arrays.
- Explored OpenLayers as a way to dynamically download and display the map relevant to the user's input. Perhaps this will be an alternative to cases where the user cannot supply the map.
To do:
- Implement file parsing to extract phylogenetic structure (probably using jsPhyloSvg).
- Implement Tree and Node objects. I may be able to use jsPhyloSvg for this, but I would need to add new attributes (color, geography, etc.) and new methods (sorting by Node depth).
Week 5: May 21 - May 27
Completed:
- Implemented Newick string parser and phylogeny data structure with two orderings: time-ordered and pruning-ordered. I was unable to easily modify jsPhyloSvg, so I programmed this from scratch instead.
- Implemented map-fetching method using OpenLayers to generate static map backgrounds.
- Decided with Trevor and Andrew to implement animations using D3.js.
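The Newick parsing and the two node orderings described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names (`parseNewick`, `timeOrder`, `pruningOrder`) and node fields (`name`, `length`, `children`, `parent`, `time`) are hypothetical, "pruning-ordered" is assumed here to mean a post-order traversal (children before parents), and the parser ignores quoted labels, comments, and internal whitespace.

```javascript
// Minimal recursive-descent Newick parser (sketch under the assumptions above).
function parseNewick(text) {
  text = text.trim();
  let pos = 0;

  function parseNode(parent) {
    const node = { name: "", length: 0, children: [], parent: parent, time: 0 };
    if (text[pos] === "(") {
      pos++; // consume "("
      do {
        node.children.push(parseNode(node));
      } while (text[pos++] === ","); // consumes each "," and the closing ")"
    }
    // Optional label, followed by an optional ":branchLength".
    let start = pos;
    while (pos < text.length && !",():;".includes(text[pos])) pos++;
    node.name = text.slice(start, pos);
    if (text[pos] === ":") {
      start = ++pos;
      while (pos < text.length && !",();".includes(text[pos])) pos++;
      node.length = parseFloat(text.slice(start, pos));
    }
    return node;
  }

  return parseNode(null);
}

// Post-order: every node appears after all of its descendants.
function pruningOrder(root) {
  const out = [];
  (function visit(node) {
    node.children.forEach(function (child) { visit(child); });
    out.push(node);
  })(root);
  return out;
}

// Nodes sorted by time elapsed since the root (root has time 0).
function timeOrder(root) {
  const out = [];
  (function visit(node) {
    node.time = node.parent ? node.parent.time + node.length : 0;
    out.push(node);
    node.children.forEach(function (child) { visit(child); });
  })(root);
  return out.sort(function (a, b) { return a.time - b.time; });
}

const demoTree = parseNewick("((A:1,B:2)AB:1,C:4)root;");
console.log(pruningOrder(demoTree).map(function (n) { return n.name; }).join(","));
// prints "A,B,AB,C,root"
console.log(timeOrder(demoTree).map(function (n) { return n.name; }).join(","));
// prints "root,AB,A,B,C"
```

A time-ordered array is convenient for querying which lineages exist at a given clock time, while a post-order array suits computations that combine information from the tips toward the root.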
To do:
- Test interactions between D3.js and various JS map libraries (OpenLayers, Leaflet, etc.). For future development, we are interested in the library that exhibits the best performance, user controls, style, and stability.
- Implement basic map-fetching method and plot tip data onto the map using the preferred library.
Week 6: May 28 - June 3
Completed:
- Implemented Polymaps.js method to fetch Cloudmade map based on geographical coordinates of dataset. Currently using an "animation format" toy dataset.
- Implemented D3.js method to place marker objects on map, and cluster based on geographical coordinates using d3.layout.pack.
- Added marker size rescaling and repositioning when map is zoomed or panned.
To do:
- Write a method to convert the phylogeny represented as a time-ordered Node array into the animation format defined in the above test method.
- d3.layout.pack geographical repositioning currently has a few kinks to work out -- should not be difficult.
- Add a clock object and control widget, whose time ranges over [0, tree height]. This will be used to query nodes from the animation schedule.
- (optional) Upload test page with toy dataset for public access to generate feedback.
Week 7: June 4 - June 10
Completed:
- Added a method to convert the input data into a display-ready format.
- Worked out many visual issues for clustering markers using d3.layout.pack. There remains some fine-tuning for how to handle pan and zoom events with d3.layout.pack.
- Added a jQuery media seeker and media controls for the control widget.
Did not complete:
- Need to implement the clock object, which is a simple stopwatch that responds to control widget events.
- Decided to delay publishing the demo page until Midterm Evaluation.
To complete:
- Complete clock object (see above).
- Assign colors to phylogenetic lineages (currently, they are drawn at random).
- Assign seeker bar to phylogeny widget and sync with clock object.
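One way the color assignment item above could be approached is sketched below. It is an assumption, not the method the project settled on: tips receive evenly spaced hues in traversal order, and each internal node takes the mean hue of its children, so closely related lineages end up with similar colors. The function name `assignHues` and the node fields `children`, `hue`, and `color` are hypothetical.

```javascript
// Assign each node an HSL color derived from its position in the tree (sketch).
function assignHues(root) {
  // First pass: count the tips so hues can be spaced evenly.
  let tipCount = 0;
  (function count(node) {
    if (node.children.length === 0) tipCount++;
    node.children.forEach(function (child) { count(child); });
  })(root);

  // Second pass (post-order): tips get evenly spaced hues,
  // internal nodes get the mean hue of their children.
  let nextTip = 0;
  (function visit(node) {
    if (node.children.length === 0) {
      node.hue = (360 * nextTip++) / tipCount;
    } else {
      node.children.forEach(function (child) { visit(child); });
      const sum = node.children.reduce(function (s, c) { return s + c.hue; }, 0);
      node.hue = sum / node.children.length;
    }
    // CSS hsl() strings can be used as SVG fill or stroke values.
    node.color = "hsl(" + Math.round(node.hue) + ", 70%, 50%)";
  })(root);
  return root;
}

// Usage with a hand-built four-tip tree ((A,B),(C,D)).
const leaf = function (name) { return { name: name, children: [] }; };
const colorDemo = assignHues({
  name: "root",
  children: [
    { name: "AB", children: [leaf("A"), leaf("B")] },
    { name: "CD", children: [leaf("C"), leaf("D")] }
  ]
});
console.log(colorDemo.color); // prints "hsl(135, 70%, 50%)"
```

Because the colors depend only on tree topology, the same lineage would receive the same color in both the phylogeny plot and the map markers.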
Week 8: June 11 - June 17
Completed:
- Mapped simple stopwatch methods from the media toolbar to the clock object: play, stop, pause, ffwd, reverse, start, end.
- Scheduled SVG element removal according to phylogeny events (i.e. branch ends).
- Implemented simple method to assign colors to nodes as a function of their position in the phylogeny.
- Merged Trevor's additions to the project (visual improvements, user friendliness, additional demo data example, code clean-up).
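The stopwatch-style clock described above might look something like the following sketch. The method names mirror the toolbar actions listed (play, stop, pause, ffwd, reverse, start, end), but their exact semantics, the `makeClock`, `seek`, `tick`, and `now` names, and the fast-forward rate of 4 are all assumptions made for illustration. The clock is advanced by explicit `tick(dt)` calls so that it stays independent of any timer or rendering library; in a browser, a timer or animation-frame callback would call `tick`.

```javascript
// A stopwatch whose time is clamped to [0, treeHeight] (sketch).
function makeClock(treeHeight) {
  let time = 0; // current position, in tree time units
  let rate = 0; // tree time units advanced per unit of elapsed time

  function clamp(t) {
    return Math.min(treeHeight, Math.max(0, t));
  }

  return {
    play: function () { rate = 1; },
    pause: function () { rate = 0; },
    stop: function () { rate = 0; time = 0; },
    ffwd: function () { rate = 4; },       // assumed fast-forward multiplier
    reverse: function () { rate = -1; },
    start: function () { time = 0; },      // jump to the root
    end: function () { time = treeHeight; }, // jump to the tips
    seek: function (t) { time = clamp(t); }, // e.g. driven by a slider
    tick: function (dt) { time = clamp(time + rate * dt); return time; },
    now: function () { return time; }
  };
}

// Usage: a tree of height 10, advanced manually.
const demoClock = makeClock(10);
demoClock.play();
console.log(demoClock.tick(3)); // prints 3
demoClock.ffwd();
console.log(demoClock.tick(5)); // prints 10 (clamped at tree height)
```

Keeping the clock free of drawing code means the same object can drive both the map markers and a seeker drawn over the phylogeny.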
Did not complete:
- Did not assign a seeker bar to the phylogeny yet. I need to access SVG properties from the jsPhyloSvg object that are not exposed in the API, so I will need to write my own access methods.
To complete:
- Extend jsPhyloSvg API to accept colors as branch arguments and to return SVG dimensions. This will allow me to match animation markers and the media seeker to the phylogeny.
- Instantiate SVG markers at the appropriate times (i.e. at divergence events, including the root divergence event) with D3.
Week 9: June 18 - June 24
Completed:
- jsPhyloSvg wasn't quite as flexible as I'd hoped, so I wrote a method to plot the phylogeny. This allowed me to color-coordinate the lineages between the phylogeny and the animation markers. It also allowed me to synchronize the movie slider with a seeker on top of the phylogeny plot. Whew!
- Markers have their "visibility" attribute turned on and off at the appropriate times. This makes the markers flash in and out of existence according to when their corresponding lineages exist.
- I ran into some browser-specific security issues with FileReader and HTTP requests, so I am now handling input as raw text in a textarea. Users may also populate the textarea using example datasets.
- I am clustering markers into biogeographical areas using d3.layout.force, but it starts to slow down when handling large datasets. Partly, this is because .force does not cluster instantaneously, but by "settling" particles into place, which requires lots of pairwise computations per tick.
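The visibility toggling described above reduces to a simple test of the clock time against each lineage's lifespan. The sketch below shows that logic in plain JavaScript; the field names `start` and `end` (times at which a lineage's branch begins and ends) and the function names are hypothetical, and the half-open interval is an assumption. With D3, a function like `markerVisibility` could be passed to a selection's `attr("visibility", ...)` call each time the clock changes.

```javascript
// Return the SVG "visibility" value for one lineage at clock time t (sketch).
// A lineage is shown from its start time up to, but not including, its end.
function markerVisibility(lineage, t) {
  return t >= lineage.start && t < lineage.end ? "visible" : "hidden";
}

// Convenience helper: ids of all lineages that exist at time t.
function visibleIds(lineages, t) {
  return lineages
    .filter(function (l) { return markerVisibility(l, t) === "visible"; })
    .map(function (l) { return l.id; });
}

// Usage: a root lineage that splits into two daughters at time 2.
const demoLineages = [
  { id: "root", start: 0, end: 2 },
  { id: "A", start: 2, end: 5 },
  { id: "B", start: 2, end: 4 }
];
console.log(visibleIds(demoLineages, 1).join(",")); // prints "root"
console.log(visibleIds(demoLineages, 3).join(",")); // prints "A,B"
```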
To complete:
- Explore new clustering methods, preferably something static.
- If time allows, explore new visualization methods -- e.g. histograms at each area (ugly, easy, and fast) or polygons (pretty, difficult, and potentially slower).
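As one illustration of what a static clustering method could look like (an assumption, not a method the project is known to have adopted), markers in an area can be placed on a sunflower-style spiral around the area's center. Each position is computed directly from the marker's index, so there is no iterative settling and the cost is linear in the number of markers. The function name `spiralLayout` and its parameters are hypothetical.

```javascript
// Place n markers on a spiral around a center point (sketch).
// Marker i sits at radius spacing * sqrt(i) and angle i * goldenAngle,
// which spreads points roughly evenly over a disc.
function spiralLayout(center, n, spacing) {
  const goldenAngle = Math.PI * (3 - Math.sqrt(5)); // about 2.39996 radians
  const points = [];
  for (let i = 0; i < n; i++) {
    const r = spacing * Math.sqrt(i);
    const a = i * goldenAngle;
    points.push({
      x: center.x + r * Math.cos(a),
      y: center.y + r * Math.sin(a)
    });
  }
  return points;
}

// Usage: five markers around an area centered at pixel (100, 50).
const demoPoints = spiralLayout({ x: 100, y: 50 }, 5, 6);
console.log(demoPoints.length); // prints 5
```

Since the layout is deterministic, markers keep the same relative positions across pan and zoom events; only the center and spacing would need to be recomputed.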
Week 10: June 25 - July 1
To do
- Add MRCA ancestral distribution to map.
Week 11: July 2 - July 8
To do
- Add ancestral distributions according to Tree and Clock to map.
Week 12-13: July 9 - July 22 (Midterm Evaluation)
To do
- Create an AreaSchedule for the phylogeny.
- Animate distribution transitions over time on map.
Week 14-15: July 23 - August 5
To do
- Add clade filters, colors, transparency effects to control panel.
- General debug and clean-up.
- If time remains, add some of the features below.
- Thorough progress review with mentors.
Week 16: August 6 - August 12
To do
- Write user manual.
- Conduct user testing and respond to feedback.
Week 17: August 13 - August 19 (Project Ends; Final Evaluation)
To do
- Final debug and code clean-up.
Features to be added
The above timeline is a conservative estimate of how long tasks will take. I have cushioned the schedule towards the end of the project in case I need to recoup misspent time. If time remains before the 12-week Summer of Code is over, I have several other ideas to improve the proposed package:
- Extend list of supported input file formats.
- Set up basic web service on cteg.berkeley.edu (or any available server)
- Tool to export animation as a video file or to Youtube
- Add user annotations on top of animation (text boxes, drawn arrows)
- Incorporate overlay transparencies (paleoclimatological data)
- Incorporate tectonic reconstructions (continental drift)
Deliverables
Source code
User manual
Sample animation
GitHub repository
(Web service)
Involved toolkits or projects
Code: Javascript, Processing.js, HTML5, D3.js, Polymaps.js, jQuery
Maps: NaturalEarthData, Pplates
Input: BEAST, Lagrange, PaleoDB, (others may be added easily)
Examples, inspirations: Animaps, SPREAD, Google Earth, GlobalForestWatch