PhyloSoC:Browser-based animations for phylogeography

Author and Relevant links

Author: Michael Landis

Mentors: Trevor Bedford, Andrew Rambaut

Project code: http://github.com/mlandis/phylowood

Web service: http://mlandis.github.com/phylowood

Abstract

The fields of phylogeography and biogeography both study the distribution of life over the planet to learn more about evolutionary, ecological, and geographical processes. The fossil record and statistical inference are used to reconstruct hypotheses about how ancestral species or populations were distributed over time. Animated graphics communicate such hypotheses clearly to both the expert and the layperson.

We propose to develop a lightweight Javascript and Processing.js package to generate interactive browser-based animations published under an open source license.

Project Goals

  1. Easy to Interpret
    • Converts numerical output into intuitive animations
    • Coloring and masking controls to distinguish specific lineages/clades
    • Time-calibrated phylogeny and discrete state space reflect modeling assumptions
  2. Easy to Use
    • Requires no software installation
    • Intuitive controls (standard movie player controls, phylogenetic movie player controls)
    • Establish file parsing for several popular software output formats (e.g. BEAST, Lagrange)
    • (possible) Export utilities (e.g. screen capture, video export, YouTube export)
  3. Extensible
    • Javascript published under GPL
    • Use other open source libraries
    • Scales for large numbers of lineages and states
    • Design classes for future extension

Timeline

Week 1: April 23 - April 29 (Community Bonding)

Completed

Week 2: April 30 - May 6

Completed

  • Configured Wiki project page.
  • Held Skype meeting with Trevor Bedford.
  • Discussed performance concerns regarding D3.js vs. Processing.js.
  • Discussed visualization concerns regarding Marker- vs. Area-based animations.
  • Uploaded sample input to GitHub.

Week 3: May 7 - May 13

Completed

  • Trevor and I investigated performance issues related to D3.js and Processing.js. Several sources indicate D3.js works well for small numbers of entities on a large canvas, while Processing.js works well for large numbers of entities on a small canvas. Trevor tested this formally: http://www.trevorbedford.com/archive/may_07_2012.html. Fortunately, much of the file parsing and animation choreography can be done independently of the visualization.
  • Created test cases with jsPhyloSvg to determine what could be repurposed (Newick parsing and tree drawing look usable; however, I will probably have to create a Tree object from scratch).

Week 4: May 14 - May 20

Completed

  • Constructed basic HTML page for testing purposes.
  • Explored Google Chrome's Javascript Tools environment.
  • Met with software developer Shawn Lewis for general advice on Javascript development.
  • Resolved FileReader-Chrome incompatibilities for file input.
  • Implemented toy classes to set Javascript coding style precedent.

Milestone: Javascript libraries and development environment are tested and approved.


Completed ahead of schedule:

  • Established file format.
  • Implemented file parsing to extract geographical and ancestral state reconstructions from file.
  • Stored geography and state data to arrays.
  • Explored OpenLayers as a way to dynamically download and display the map relevant to the user's input. Perhaps this will be an alternative to cases where the user cannot supply the map.


To do:

  • Implement file parsing to extract phylogenetic structure (probably using jsPhyloSvg).
  • Implement Tree and Node objects. I may be able to use jsPhyloSvg for this, but I would need to add new attributes (color, geography, etc.) and new methods (sorting by Node depth).

Week 5: May 21 - May 27

Completed:

  • Implemented Newick string parser and phylogeny data structure with two orderings: time-ordered and pruning-ordered. I was unable to easily modify jsPhyloSvg, so I programmed this from scratch instead.
  • Implemented map-fetching method using OpenLayers to generate static map backgrounds.
  • Decided with Trevor and Andrew to implement animations using D3.js.
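
As an illustration of the parser and the two orderings mentioned above, here is a minimal sketch. It is not the project's actual code: the function names (parseNewick, orderTree) and the node fields are hypothetical, and "pruning-ordered" is assumed to mean post-order (children before parents). It handles plain Newick with labels and branch lengths only, with no quoted labels or comments.

```javascript
// Hypothetical sketch: parse a Newick string into a tree of plain objects.
function parseNewick(newick) {
  const text = newick.trim().replace(/;$/, "");
  let pos = 0;

  function parseNode(parent) {
    const node = { name: "", length: 0, time: 0, parent: parent, children: [] };
    if (text[pos] === "(") {
      pos++; // consume "("
      do {
        node.children.push(parseNode(node));
      } while (text[pos++] === ","); // the final increment consumes ")"
    }
    // Read the label, which ends at ":", ",", ")" or end of string.
    let start = pos;
    while (pos < text.length && !":,)".includes(text[pos])) pos++;
    node.name = text.slice(start, pos);
    // Read the optional branch length after ":".
    if (text[pos] === ":") {
      start = ++pos;
      while (pos < text.length && !",)".includes(text[pos])) pos++;
      node.length = parseFloat(text.slice(start, pos));
    }
    return node;
  }

  return parseNode(null);
}

// Assign each node its time (distance from the root) and build both orderings.
function orderTree(root) {
  const pruning = []; // assumed meaning: post-order, children before parents
  (function visit(node, parentTime) {
    node.time = parentTime + node.length;
    node.children.forEach((child) => visit(child, node.time));
    pruning.push(node);
  })(root, 0);
  const timeOrdered = pruning.slice().sort((a, b) => a.time - b.time);
  return { pruning: pruning, timeOrdered: timeOrdered };
}

const tree = orderTree(parseNewick("((A:1,B:2)AB:1,C:4)root;"));
console.log(tree.pruning.map((n) => n.name).join(","));     // A,B,AB,C,root
console.log(tree.timeOrdered.map((n) => n.name).join(",")); // root,AB,A,B,C
```

The time-ordered array is convenient for scheduling animation events, while the post-order array visits every child before its parent.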


To do:

  • Test interactions between D3.js and various JS map libraries (OpenLayers, Leaflet, etc.). For future development, we are interested in a library that exhibits the best performance, user controls, style, and stability.
  • Implement a basic map-fetching method and place tip data onto the map using the preferred library.

Week 6: May 28 - June 3

Completed:

  • Implemented Polymaps.js method to fetch Cloudmade map based on geographical coordinates of dataset. Currently using an "animation format" toy dataset.
  • Implemented D3.js method to place marker objects on map, and cluster based on geographical coordinates using d3.layout.pack.
  • Added marker size rescaling and repositioning when map is zoomed or panned.


To do:

  • Write a method to convert the phylogeny represented as a time-ordered Node array into the animation format defined in the above test method.
  • d3.layout.pack geographical repositioning currently has a few kinks to work out -- should not be difficult.
  • Add a clock object and control widget, whose time ranges from [0, tree height]. This will be used to query nodes from the animation schedule.
  • (optional) Upload test page with toy dataset for public access to generate feedback.

Week 7: June 4 - June 10

Completed:

  • Added a method to convert the input data into a display-ready format.
  • Worked out many visual issues for clustering markers using d3.layout.pack. There remains some fine-tuning for how to handle pan and zoom events with d3.layout.pack.
  • Added a jQuery media seeker and media controls for the control widget.


Did not complete:

  • Need to implement the clock object, which is a simple stopwatch that responds to control widget events.
  • Decided to delay publishing the demo page until Midterm Evaluation.


To complete:

  • Complete clock object (see above).
  • Assign colors to phylogenetic lineages (currently, they are drawn at random).
  • Assign seeker bar to phylogeny widget and sync with clock object.

Week 8: June 11 - June 17

Completed:

  • Mapped simple stopwatch methods from the media toolbar to the clock object: play, stop, pause, ffwd, reverse, start, end.
  • Scheduled SVG element removal according to phylogeny events (i.e. branch ends).
  • Implemented simple method to assign colors to nodes as a function of their position in the phylogeny.
  • Merged Trevor's additions to the project (visual improvements, user friendliness, additional demo data example, code clean-up).
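
For illustration, a stopwatch-style clock exposing the seven toolbar methods listed above might look like the sketch below. This is not the project's actual code: the class name, the tick(dt) method, the fast-forward multiplier, and the exact meaning of each method (for example, that stop also rewinds to the beginning) are all assumptions.

```javascript
// Hypothetical sketch of a clock whose time is confined to [startTime, endTime],
// e.g. [0, tree height].
class Clock {
  constructor(startTime, endTime) {
    this.startTime = startTime;
    this.endTime = endTime;
    this.time = startTime;
    this.direction = 0; // 0 = paused, +1 = forward, -1 = reverse
    this.speed = 1;     // multiplier applied to elapsed time
  }

  play()    { this.direction = 1;  this.speed = 1; }
  reverse() { this.direction = -1; this.speed = 1; }
  ffwd()    { this.direction = 1;  this.speed = 4; } // assumed 4x speed
  pause()   { this.direction = 0; }
  stop()    { this.direction = 0; this.time = this.startTime; } // assumed: stop rewinds
  start()   { this.time = this.startTime; } // jump to the beginning
  end()     { this.time = this.endTime; }   // jump to the end

  // Advance by dt units of elapsed time, clamp to the valid range,
  // and pause automatically when either boundary is reached.
  tick(dt) {
    const next = this.time + this.direction * this.speed * dt;
    this.time = Math.min(this.endTime, Math.max(this.startTime, next));
    if (this.direction > 0 && this.time === this.endTime) this.direction = 0;
    if (this.direction < 0 && this.time === this.startTime) this.direction = 0;
    return this.time;
  }
}

const clock = new Clock(0, 10);
clock.play();
console.log(clock.tick(3)); // 3
clock.pause();
console.log(clock.tick(3)); // 3 (paused, so time does not move)
clock.ffwd();
console.log(clock.tick(1)); // 7
```

In a browser, tick would typically be driven by a timer or animation frame callback, with the toolbar buttons bound to the other methods.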


Did not complete:

  • Did not assign a seeker bar to the phylogeny yet. I need to access SVG properties from the jsPhyloSvg object that are not exposed in the API, so I will need to write my own access methods.


To complete:

  • Extend jsPhyloSvg API to accept colors as branch arguments and to return SVG dimensions. This will allow me to match the animation markers and the media seeker to the phylogeny.
  • Instantiate SVG markers at the appropriate times (i.e. at divergence events, including the root divergence event) with D3.

Week 9: June 18 - June 24

Completed:

  • jsPhyloSvg wasn't quite as flexible as I'd hoped, so I wrote a method to plot the phylogeny. This allowed me to color-coordinate the lineages between the phylogeny and the animation markers. It also allowed me to synchronize the movie slider with a seeker on top of the phylogeny plot. Whew!
  • Markers have their "visibility" attribute turned on and off at the appropriate times. This makes the markers flash in and out of existence according to when their corresponding lineages exist.
  • I ran into some browser-specific security issues with FileReader and http requests, so I am now handling input as raw text in a textarea. Users may also populate the textarea using example datasets.
  • I am clustering markers into biogeographical areas using d3.layout.force, but it starts to slow down when handling large datasets. Partly, this is because .force does not cluster instantaneously, but by "settling" particles into place, which requires lots of pairwise computations per tick.
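
The visibility toggling described above can be sketched as a pure function of the clock time. This is only an illustration under stated assumptions: the lineage records, their field names (id, start, end), and the half-open interval convention are hypothetical and are not taken from the project code.

```javascript
// Hypothetical lineage schedule: each lineage exists on the half-open
// interval [start, end), measured in time units from the root.
const lineages = [
  { id: "AB", start: 0, end: 1 },
  { id: "A",  start: 1, end: 2 },
  { id: "B",  start: 1, end: 3 },
  { id: "C",  start: 0, end: 4 },
];

// Value for the SVG "visibility" attribute of one lineage's marker at time t.
function visibilityAt(lineage, t) {
  return lineage.start <= t && t < lineage.end ? "visible" : "hidden";
}

// Ids of all lineages whose markers should be shown at time t.
function visibleIds(schedule, t) {
  return schedule
    .filter((lineage) => visibilityAt(lineage, t) === "visible")
    .map((lineage) => lineage.id);
}

console.log(visibleIds(lineages, 0.5)); // [ 'AB', 'C' ]
console.log(visibleIds(lineages, 2.5)); // [ 'B', 'C' ]
```

On the page itself, the string returned by visibilityAt could be applied to each marker with a D3 selection, for example through selection.attr("visibility", ...), whenever the clock time changes.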


To complete:

  • Explore new clustering methods, preferably something static.
  • If time allows, explore new visualization methods -- e.g. histograms at each area (ugly, easy, and fast) or polygons (pretty, difficult, and potentially slower).

Week 10: June 25 - July 1

To do

  • Add MRCA ancestral distribution to map.

Week 11: July 2 - July 8

To do

  • Add ancestral distributions according to Tree and Clock to map.

Week 12-13: July 9 - July 22 (Midterm Evaluation)

To do

  • Create an AreaSchedule for the phylogeny.
  • Animate distribution transitions over time on map.

Week 14-15: July 23 - August 6

To do

  • Add clade filters, colors, transparency effects to control panel.
  • General debug and clean-up.
  • If time remains, add some of the features below.
  • Thorough progress review with mentors.

Week 16: August 6 - August 12

To do

  • Write user manual.
  • Conduct user testing and respond to feedback.

Week 17: August 13 - August 19 (Project Ends; Final Evaluation)

To do

  • Final debug and code clean-up.

Features to be added

The above timeline is a conservative estimate of how long tasks will take. I have cushioned the schedule towards the end of the project in case I need to make up for lost time. If time remains before the 12-week Summer of Code is over, I have several other ideas to improve the proposed package:

  • Extend list of supported input file formats.
  • Set up a basic web service on cteg.berkeley.edu (or any available server)
  • Tool to export animation as a video file or to YouTube
  • Add user annotations on top of animation (text boxes, drawn arrows)
  • Incorporate overlay transparencies (paleoclimatological data)
  • Incorporate tectonic reconstructions (continental drift)

Deliverables

Source code

User manual

Sample animation

GitHub repository

(Web service)

Involved toolkits or projects

Code: Javascript, Processing.js, HTML5, D3.js, Polymaps.js, jQuery

Maps: NaturalEarthData, Pplates

Input: BEAST, Lagrange, PaleoDB, (others may be added easily)

Examples, inspirations: Animaps, SPREAD, Google Earth, GlobalForestWatch