Difference between revisions of "PhyloSoC: Automated submission of rich data to TreeBASE"

From Phyloinformatics
Revision as of 16:14, 23 May 2011

Project Author and Mentors

Check out the GSoC 2011 TreeBASE project blog!

Also, check out TreeBASE!

Abstract

TreeBASE acts as an archive for phylogenetic analyses. Data are currently submitted to TreeBASE as NEXUS files. However, this format results in a clunky user interface and does not allow metadata or additional annotations to be submitted automatically. This project will add support for submitting NeXML files to TreeBASE, so that metadata can be submitted easily and new annotations of the metadata can be displayed in a user-friendly manner.

Background and Purpose

At present, TreeBASE serves as an archive for phylogenetic data. Users can search the database for studies by author, study ID, and other keywords found throughout the work, and then browse the metadata of a study. Users can also search for taxa of interest by several identifiers, and the results link to NCBI’s taxonomy browser or the Universal Biological Indexer and Organizer. The matrices used in the analyses can also be viewed, with a link to the original NEXUS file and a list of the sequences, although only the first 30 characters are visible. The trees displayed for a particular study can be further refined by topology type, and all trees can be viewed using PhyloWidget. Although TreeBASE is useful, there are numerous additional features it could include to maximize its usefulness and minimize the number of clicks needed to navigate the site.

Examples of annotations that would be useful to the TreeBASE user community include linking sequence data to GenBank accession numbers, so that users can be directed to NCBI to access the sequence data for use in future analyses, and geocoding the locality coordinates of the organisms included in a study. These are only a few of the many annotations that could be incorporated into TreeBASE to improve the utility of this resource. To expand the annotations of the data in TreeBASE, the submission process must first be made more user-friendly. My project for Google Summer of Code 2011 will allow phylogenetic data to be submitted to TreeBASE so that both the data and metadata are exported in a way that simplifies the user’s interaction with the website.
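As a rough illustration of the two annotations mentioned above, the following sketch attaches a GenBank link and locality coordinates to a taxon record as NeXML-style `meta` elements. It is only a sketch under stated assumptions, not the project's implementation: the function name `annotate_otu`, the predicate `ex:genbankRecord`, the placeholder accession, and the example coordinates are all hypothetical, and XML namespace declarations are omitted for brevity. The `dwc:` property names follow Darwin Core terms.

```python
import xml.etree.ElementTree as ET

# Base URL of NCBI's nucleotide record pages.
NCBI_NUCCORE = "https://www.ncbi.nlm.nih.gov/nuccore/"


def annotate_otu(otu_id, label, accession, lat, lon):
    """Build an <otu> element with a GenBank link and locality coordinates.

    The element and attribute layout imitates NeXML meta annotations;
    the predicate "ex:genbankRecord" is a hypothetical name.
    """
    otu = ET.Element("otu", {"id": otu_id, "label": label})

    # Resource-style annotation: points at the NCBI record for the accession.
    ET.SubElement(otu, "meta", {
        "rel": "ex:genbankRecord",  # hypothetical predicate
        "href": NCBI_NUCCORE + accession,
    })

    # Literal-style annotations: decimal latitude and longitude.
    for prop, value in (("dwc:decimalLatitude", lat),
                        ("dwc:decimalLongitude", lon)):
        ET.SubElement(otu, "meta", {"property": prop, "content": str(value)})

    return otu


if __name__ == "__main__":
    # "XX000000" is a placeholder, not a real accession number.
    example = annotate_otu("otu1", "Example taxon", "XX000000", 12.5, -45.25)
    print(ET.tostring(example, encoding="unicode"))
```

Keeping each annotation as a separate `meta` child means new kinds of metadata can be added without changing the structure of the taxon record itself.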

Project Goals

By the end of the term, we would like to accomplish the following:

Timeline for Project Plan

4/25-5/22: Community Bonding Period
Goal 1: Build core code base locally. Completed
5/23-5/29: Week One
5/30-6/05: Week Two
6/06-6/12: Week Three
6/13-6/19: Week Four
6/20-6/26: Week Five
6/27-7/03: Week Six
7/04-7/10: Week Seven
7/11-7/17: Week Eight
7/18-7/24: Week Nine
7/25-7/31: Week Ten
8/01-8/07: Week Eleven
8/08-8/14: Week Twelve
8/15: Pencils Down
Submit Final Evaluations