From Phyloinformatics
Revision as of 21:49, 13 May 2007 by Panda linda

This project is part of the 2007 Phyloinformatics Summer of Code, which is itself part of the Google Summer of Code. This web page will serve as the central resource for information relating to the project "Visualizing Phylogeographic Information," which is being developed by Yi-Hsin Erica Tsai.

The goal of this project is to develop a web-based application that generates geographic maps of DNA haplotype data, which are often used in the course of phylogeographic analysis. The application will create maps of pie charts, viewable through Google Earth, that show the spatial distribution of each haplotype, the frequency of each haplotype in each population, and the number of samples included per population. This program may also be useful to people outside of evolutionary biology; anyone who needs to visualize frequency data on maps may find this application helpful.


Project Overview

Phylogeography has enjoyed an explosion of data, from research on the migration patterns of organisms to studies of population genetics and population structure. However, there is still no easy way to generate maps of DNA haplotype frequency data. Imagine a map with all sample locations marked and, centered on each location, a pie (or stacked bar) chart showing the frequency of each haplotype within the population. The size of each pie is proportional to the number of samples genotyped in that population. Often these maps are drawn by hand in Adobe Illustrator or in other difficult-to-use, proprietary map drawing programs (e.g. ArcGIS). This procedure becomes infeasible with larger data sets, and it does not lend itself to viewing and analyzing multiple data sets simultaneously, as is becoming more common in comparative phylogeography. This software package implements a viewer designed for that purpose. The program would have broader applications than just genetic data; any sort of frequency-based information with a geographical component (e.g. % of sunny, rainy, and snowy days) could be visualized.
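
The pie-chart idea above can be sketched in a few lines of Python. This is a minimal illustration only, not the project's code: the function names `haplotype_frequencies` and `pie_radius`, the sample counts, and the choice to scale pie area (rather than radius) with sample size are all assumptions made for this example.

```python
import math

def haplotype_frequencies(counts):
    """Convert raw haplotype counts for one population into frequencies."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("population has no samples")
    return {hap: n / total for hap, n in counts.items()}

def pie_radius(n_samples, base_radius=1.0):
    """Radius of a pie whose area is proportional to the number of samples.

    Assumption: area, not radius, scales with sample size, so the radius
    grows with the square root of n_samples.
    """
    return base_radius * math.sqrt(n_samples)

# Hypothetical counts for a single population.
counts = {"A": 6, "B": 3, "C": 1}
freqs = haplotype_frequencies(counts)      # slice sizes for the pie
radius = pie_radius(sum(counts.values()))  # overall size of the pie
```

Scaling the area rather than the radius is one possible design choice; it keeps a population with four times the samples from looking sixteen times larger on the map.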

The product would be a web-based application with ties to Google Maps or Google Earth. The web application would include a data manager that exports KML. The KML would be used within the browser for visualization in Google Maps, or it could be exported for use with Google Earth. There are three main components to develop. First, a data manager is needed to import, edit, and export data. Second, a visualization tool will generate the phylogeographic maps. Third, the visualization tool will be expanded to display multiple data sets simultaneously, for instance to compare haplotype frequency distributions of multiple loci or haplotype frequencies of multiple species. The program will allow manipulation of data within the application (e.g. grouping all rare haplotypes together, or showing only a subset of the populations) to generate new phylogeographic maps without the need to create and load new input files. The goal for the program is to allow easy visualization of phylogeographic data on a map and to facilitate subsequent spatial data analysis.
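
The two data manipulations mentioned above (grouping rare haplotypes and showing a subset of populations) might look roughly like the following Python sketch. The function names, the 5% threshold, the "Other" label, and the sample data are hypothetical choices for illustration; the project plan does not specify any of them.

```python
def group_rare_haplotypes(counts, min_frequency=0.05, label="Other"):
    """Merge haplotypes below min_frequency into a single category.

    `counts` maps haplotype name -> number of samples in one population.
    The threshold and label are illustrative assumptions.
    """
    total = sum(counts.values())
    grouped = {}
    rare_total = 0
    for hap, n in counts.items():
        if total > 0 and n / total < min_frequency:
            rare_total += n
        else:
            grouped[hap] = n
    if rare_total:
        grouped[label] = grouped.get(label, 0) + rare_total
    return grouped

def select_populations(data, keep):
    """Return only the populations whose names appear in `keep`."""
    return {pop: counts for pop, counts in data.items() if pop in keep}

# Hypothetical data set: population name -> haplotype counts.
data = {
    "pop1": {"A": 50, "B": 45, "C": 3, "D": 2},
    "pop2": {"A": 10, "B": 10},
}
subset = select_populations(data, {"pop1"})
simplified = {pop: group_rare_haplotypes(c) for pop, c in subset.items()}
```

Because both functions return new dictionaries, the original data stays untouched and a new map can be generated without loading a new input file.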

Overall Project Plan

Phase 0: Getting development environment set up

  • Learn how to use Subversion.
  • Work through PHP tutorials.
  • Install PHP and Apache, and try them out on my laptop (create my development environment).
  • Learn how to download a PHP application and get it working. Learn how web apps work.
  • Get a "hello world" type program running.

Phase 1: Exploratory phase

  • Learn how to embed maps on a webpage.
  • Learn the relationship between Google Earth and Google Maps.
  • How are they the same, how are they different? What can you do with one that you can't do with the other?
  • Explore KML and general XML.
  • Explore Google Earth and Google Maps APIs.

Phase 2: Finalize design

  • Page by page description of what the user sees.
  • How to input data. Are they going to upload files, input in a text box? What format?
  • How to export data. Format? Data persistence? Can users store data, results, maps, etc.?
  • What is the viewer? Google Earth? Google Maps?

Phase 3: Creating a functional prototype for the viewer

  • Write an application that:
    • Reads in sample data.
    • Generates the appropriate KML.
    • Displays the data on a map.
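
As a rough illustration of the "generates the appropriate KML" step, the Python sketch below writes one KML Placemark per population using only the standard library. The function name `build_kml`, the input dictionary layout, and the sample coordinates are assumptions for this example. It does not draw pie charts; it only places a labelled point with the haplotype counts in its description.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"

def build_kml(populations):
    """Return a KML document (as a string) with one Placemark per population.

    Each population is a dict with hypothetical keys:
    name, lat, lon, counts (haplotype -> number of samples).
    """
    ET.register_namespace("", KML_NS)  # serialize KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for pop in populations:
        pm = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(pm, f"{{{KML_NS}}}name").text = pop["name"]
        total = sum(pop["counts"].values())
        parts = [f"{hap}: {n}/{total}" for hap, n in sorted(pop["counts"].items())]
        ET.SubElement(pm, f"{{{KML_NS}}}description").text = "; ".join(parts)
        point = ET.SubElement(pm, f"{{{KML_NS}}}Point")
        # KML coordinates are written as longitude,latitude,altitude.
        coords = f"{pop['lon']},{pop['lat']},0"
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = coords
    return ET.tostring(kml, encoding="unicode")

# Hypothetical sample data.
sample = [
    {"name": "siteA", "lat": 42.4, "lon": -71.1, "counts": {"H1": 7, "H2": 3}},
    {"name": "siteB", "lat": 35.9, "lon": -79.0, "counts": {"H1": 5}},
]
kml_text = build_kml(sample)
```

Saving `kml_text` to a file with a `.kml` extension should give something that Google Earth can open, which would cover the "displays the data on a map" step for a first prototype.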

Phase 4: Creating a data manager

  • Write an application that:
    • Imports data.
    • Manipulates data (dynamically?).
    • Exports map data, KML, or some other format.
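
The import step could be prototyped along the following lines. The input format is still an open design question in this plan, so the comma-separated layout used here (columns `population`, `lat`, `lon`, `haplotype`, `count`), the function name `import_samples`, and the sample rows are purely assumptions for the sake of a runnable Python sketch.

```python
import csv
import io

def import_samples(text):
    """Parse delimited text into a dict keyed by population name.

    Assumed columns (hypothetical format): population, lat, lon,
    haplotype, count. Repeated haplotype rows are summed.
    """
    populations = {}
    for row in csv.DictReader(io.StringIO(text)):
        pop = populations.setdefault(
            row["population"],
            {"lat": float(row["lat"]), "lon": float(row["lon"]), "counts": {}},
        )
        hap = row["haplotype"]
        pop["counts"][hap] = pop["counts"].get(hap, 0) + int(row["count"])
    return populations

# Hypothetical input, as it might be pasted into a text box or uploaded.
sample_text = (
    "population,lat,lon,haplotype,count\n"
    "siteA,42.4,-71.1,H1,7\n"
    "siteA,42.4,-71.1,H2,3\n"
    "siteB,35.9,-79.0,H1,5\n"
)
imported = import_samples(sample_text)
```

Parsing from a string rather than a file path means the same function could serve both uploaded files and text-box input, whichever the final design settles on.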

Phase 5: Integrate the viewer with the data manager