Difference between revisions of "Phyloinformatics Summer of Code 2007"

From Phyloinformatics
(News)
 
(83 intermediate revisions by 15 users not shown)
Line 1: Line 1:
We are applying to the [http://code.google.com/soc Google Summer of Code] (GSoC) program for the first time this year. On this page we are collecting ideas, possible projects, prerequisites, possible solution approaches, mentors, other people or channels to contact for more information or to bounce ideas off of, etc
+
We applied to the [http://code.google.com/soc Google Summer of Code] (GSoC) program for the first time in 2007. On this page we collected project ideas, prerequisites, possible solution approaches, mentors, and other people or channels to contact for more information or to bounce ideas off.
  
==News==
 
  
* 23:13, 12 April 2007 (EDT) Students were accepted last night. [http://code.google.com/soc/nescent/about.html We received 11 slots]! Even though that's hardly uplifting if you are one of the 56 we had to decline, it's much more than we had hoped for in our wildest dreams. Thank you again to everyone who applied; it is your enthusiasm for our organization's ideas that gave us this many slots. [[User:Hlapp|Hlapp]]
 
* 17:35, 28 March 2007 (EDT) Student applications closed yesterday at noon (EDT). We have received 67 applications! Thanks to everyone who applied, we are humbled by your enthusiasm. The accepted applications will be published by Google on April 11. [[User:Hlapp|Hlapp]]
 
* 00:26, 19 March 2007 (EDT) The announcement about us participating in the program has gone out, [http://evol.mcmaster.ca/~brian/netevoldir/Other/NESCent.Phyloinformatics for example to EvolDir], but also to the Bioinformatics.org BB, local lists, and more. [[User:Hlapp|Hlapp]]
 
* 12:39, 15 March 2007 (EDT) We have been accepted! The [mailto:phylosoc%40nescent%2eorg phylosoc@nescent.org] mailing list is on-line and open to anyone to post questions or suggestions. I also published the [http://docs.google.com/Doc?id=dhdjhbvd_6dc5b6j application document] (after obfuscating email addresses). [[User:Hlapp|Hlapp]]
 
* 17:32, 12 March 2007 (EDT) Finally pulled everything together, and submitted application! [[User:Hlapp|Hlapp]]
 
* 21:20, 2 March 2007 (EST) Created page, added a couple of links, outline, and started filling in some bits. [[User:Hlapp|Hlapp]]
 
  
 
==Why==
 
==Why==
Line 19: Line 12:
 
==Accepted Projects==
 
==Accepted Projects==
  
''Note that the accepted projects, students, and their mentors are also [http://code.google.com/soc/nescent/about.html published by Google].''
+
''Note that the accepted projects, students, and their mentors are also [http://code.google.com/soc/2007/nescent/about.html published by Google].''
  
 
===A Perl-based Command Line Interface to a Topological Query Application for BioSQL in Support of High Throughput Classification and Analysis of LTR Retrotransposons in Plant Genomes===
 
===A Perl-based Command Line Interface to a Topological Query Application for BioSQL in Support of High Throughput Classification and Analysis of LTR Retrotransposons in Plant Genomes===
Line 25: Line 18:
 
This is an application for the [[#Topological_query_application_for_BioSQL|Topological query application for BioSQL project idea]].  
 
This is an application for the [[#Topological_query_application_for_BioSQL|Topological query application for BioSQL project idea]].  
  
'''Abstract:''' <!-- try to limit this to about 50 words -->
+
'''Abstract:''' I will use Perl to create a set of command line programs for topological queries in BioSQL. The goal of this project is to create an interface that is suitable for high-throughput creation and modification of SQL-based phylogenies. I will use this interface to further my research on the classification of plant LTR retrotransposons.
  
 
'''Student:''' [http://jestill.myweb.uga.edu/ James Estill]
 
'''Student:''' [http://jestill.myweb.uga.edu/ James Estill]
Line 31: Line 24:
 
'''Mentor(s):''' Hilmar Lapp (primary), Weigang Qiu, Bill Piel, Mike Muratet (secondary)
 
'''Mentor(s):''' Hilmar Lapp (primary), Weigang Qiu, Bill Piel, Mike Muratet (secondary)
  
'''Project homepage:''' [http://jestill.myweb.uga.edu/gsoc.html Topological Query Application for BioSQL]
+
'''Project Blog:''' http://phylosoc2007jestill.blogspot.com/
  
'''Source code:'''  
+
'''Project homepage:''' [[PhyloSoC:Command_Line_Topological_Query_Application_for_BioSQL|Command Line Topological Query Application for BioSQL]]
  
===Developing user-oriented, standards-based phylogenomics tools: PhyloSOAP and PhyloWidget===
+
'''Source code:''' http://code.google.com/p/phylosoc2007jestill/
 +
 
 +
===Phyloinformatics Web Tools: PhyloWidget===
  
 
This application proposes a new project that builds upon the [[#Topological_query_application_for_BioSQL|Topological query application for BioSQL project idea]].  
 
This application proposes a new project that builds upon the [[#Topological_query_application_for_BioSQL|Topological query application for BioSQL project idea]].  
  
'''Abstract:''' <!-- try to limit this to about 50 words -->
+
'''Abstract:''' The purpose of this project is to (a) create a user-friendly and web-accessible GUI for creating phylogenetic tree queries, and (b) implement a SOAP-based client and server protocol for transmitting phylogenetic queries and results. <!-- try to limit this to about 50 words -->
  
'''Student:''' [http://pantheon.yale.edu/~gej5/ Gregory Jordan]
+
'''Student:''' [http://www.andrewberman.org/ Gregory Jordan]
  
 
'''Mentor(s):''' Bill Piel (primary), Hilmar Lapp (secondary)
 
'''Mentor(s):''' Bill Piel (primary), Hilmar Lapp (secondary)
  
'''Project homepage:'''  
+
'''Project homepage:''' http://www.phylowidget.org/
  
'''Source code:'''  
+
'''Source code:''' http://code.google.com/p/phylowidget/source
  
===Application for Phyloinformatics project===
+
===APIs for BioJava===
  
This is an application for the [[#Create_a_usable_phyloinformatics_API_for_BioJava|Create a usable phyloinformatics API for BioJava project idea]].  
+
This is an application for the [[#Create_a_usable_phyloinformatics_API_for_BioJava|Create a usable phyloinformatics API for BioJava project idea]].
  
'''Abstract:''' <!-- try to limit this to about 50 words -->
+
'''Abstract: ''' We are planning to develop APIs for BioJava. In particular, we will focus on phylogeny reconstruction methods such as UPGMA, Maximum Parsimony, and Maximum Likelihood.
  
'''Student:'''  
+
'''Student: ''' [mailto:blee34@mail.gatech.edu Bohyun Lee]
  
'''Mentor(s):'''
+
'''Mentor(s): ''' Richard Holland
  
'''Project homepage:'''  
+
'''Project homepage: ''' http://biojava.org/wiki/BioJava:PhyloSOC07
  
'''Source code:'''  
+
'''Source code:''' [http://code.open-bio.org/svnweb/index.cgi/biojava/browse/biojava-live/trunk/src/org/biojavax/bio/phylo org.biojavax.bio.phylo package (SVN)]
  
 
===Multi-language bindings to the C++ NEXUS Class Library===
 
===Multi-language bindings to the C++ NEXUS Class Library===
  
This is an application for the [[#Enable_multi-language_bindings_to_the_C.2B.2B_NEXUS_Class_Library|Enable multi-language bindings to the C++ NEXUS Class Library project idea]].  
+
This is an application for the [[#Enable_multi-language_bindings_to_the_C.2B.2B_NEXUS_Class_Library|Enable multi-language bindings to the C++ NEXUS Class Library project idea]].
  
'''Abstract:''' <!-- try to limit this to about 50 words -->
+
'''Abstract:''' The goal of this project is the development of bindings to NCL for three scripting languages (Perl, Python, Ruby) employing SWIG, the Simplified Wrapper and Interface Generator, which is an open source tool designed to facilitate the development of extensions from C/C++ to other languages. These bindings will provide a way for rapid prototyping and easy development of applications supporting the NEXUS format.
  
'''Student:'''  
+
'''Student:''' [mailto:darw.dobz@gmail.com David Suárez Pascal]
  
'''Mentor(s):'''
+
'''Mentor:''' Mark Holder
  
'''Project homepage:'''  
+
'''Project homepage:''' [[PhyloSoC:Multi-language_bindings_to_the_NEXUS_Class_Library|Multi-language bindings to the C++ NEXUS Class Library]]
  
'''Source code:'''  
+
'''Source code:''' <!--http://ncl.svn.sourceforge.net/svnroot/ncl/-->http://svn.pascal-in.net/nclbindings (subversion repository)
  
 
===Phylogenetic XML <--> Object serialization===
 
===Phylogenetic XML <--> Object serialization===
  
This is an application for the [[#Phylogenetic_XML_.3C--.3E_Object_serialization_2|Phylogenetic XML <--> Object_serialization project idea]].  
+
This is an application for the [[#Phylogenetic_XML_.3C--.3E_Object_serialization_2|Phylogenetic XML <--> Object_serialization project idea]].
  
'''Abstract:''' <!-- try to limit this to about 50 words -->
+
'''Abstract:''' The goal of this project is to develop an XML based file format for use in phylogenetic analysis.  This file type will contain most of the functionality of the current NEXUS file format.  Additionally, a parser for this file type will be developed for the BioPerl package.<!-- try to limit this to about 50 words -->
  
'''Student:'''  
+
'''Student:''' [mailto:calibos@comcast.net Jason Caravas]
  
'''Mentor(s):'''
+
'''Mentor(s):''' Rutger Vos
  
'''Project homepage:'''  
+
'''Project homepage:''' [[PhyloSoC:Phylogenetic_XML|Phylogenetic XML <--> Object serialization]]
  
'''Source code:'''  
+
'''Source code:''' http://code.google.com/p/nexml07gsoc/source
  
 
===Ajax interface for the XRate command-line tool===
 
===Ajax interface for the XRate command-line tool===
Line 95: Line 90:
 
This is an application for the [[#Evolve_Unix_phyloinformatics_tools_into_Ajax_applications|Evolve Unix phyloinformatics tools into Ajax applications project idea]].  
 
This is an application for the [[#Evolve_Unix_phyloinformatics_tools_into_Ajax_applications|Evolve Unix phyloinformatics tools into Ajax applications project idea]].  
  
'''Abstract:''' <!-- try to limit this to about 50 words -->
+
'''Abstract:''' This project focuses on building an easy-to-use, visual interface for Xrate. It will bring together several tools from the dart bioinformatics package to annotate multiple sequence alignments, train phylogenetic grammars, and visualize this information in an Ajax-enabled asynchronous web interface. As an initial application, it will be used to provide a web interface to explore and use the Rfam collection of grammars.
 +
 
 +
'''Student:''' [http://biowiki.org/LarsBarquist Lars Barquist]
 +
 
 +
'''Mentor(s):''' Ian Holmes
  
'''Student:'''  
+
'''Project homepage:''' [http://biowiki.org/XreiProgram xREI biowiki page] (includes instructions for downloading the current source via cvs)
  
'''Mentor(s):'''
+
'''Project Plan:''' http://ajax-xrate.googlecode.com/files/plan.pdf
  
'''Project homepage:'''  
+
'''Source code:''' [http://google-summer-of-code-2007-nescent.googlecode.com/files/LarsEric_Barquist.tar.gz LarsEric_Barquist.tar.gz] (original google repository, currently out of date)
  
'''Source code:'''  
+
'''Demo:''' http://harmony.biowiki.org/xrei/
  
 
===Implementing a web interface for command line-based bioinformatics tools===
 
===Implementing a web interface for command line-based bioinformatics tools===
  
This application is being revised for the [[#AJAX_widgets_for_phylo-informatics|AJAX widgets for phylo-informatics project idea]].  
+
This application is being revised for the [[#AJAX_widgets_for_phylo-informatics|AJAX widgets for phylo-informatics project idea]].
  
'''Abstract:''' <!-- try to limit this to about 50 words -->
+
'''Abstract:''' This project aims to construct a web-based interface for the tree display and annotation feature of the xrate tool. Through the use of an easy-to-use web interface, those who are not familiar with the process of compiling their own software in a Unix environment, but are comfortable with using a web browser, will be able to use xrate's tree tools. It is hoped that such an interface will then be portable enough to be used on other similar tools to allow for easy deployment of any command-line-based bioinformatics utility. It uses Perl scripts to generate either .png or .svg images of trees to be displayed on a webpage. These trees are generated either from raw .tre files (Newick format) or from special .stk files, which are processed by xrate to generate trees.
  
'''Student:'''  
+
'''Student:''' [mailto:james.leung@ualberta.ca James Leung]
  
'''Mentor(s):'''
+
'''Mentor(s):''' Suzanna Lewis
  
'''Project homepage:'''  
+
'''Project homepage:''' http://code.google.com/p/xrateparser/
  
'''Source code:'''  
+
'''Source code:''' http://code.google.com/p/xrateparser/source
  
 
===Phylogenetic & haplotype displays for GBrowse===
 
===Phylogenetic & haplotype displays for GBrowse===
  
This is an application for the [[#Phylogenetic_.26_haplotype_displays_for_GBrowse_2|Phylogenetic & haplotype displays for GBrowse project idea]].  
+
This is an application for the [[#Phylogenetic_.26_haplotype_displays_for_GBrowse_2|Phylogenetic & haplotype displays for GBrowse project idea]].
 +
 
 +
'''Abstract:''' This is a project to extend the visualization for the [http://www.gmod.org/wiki/index.php/Gbrowse Genome Browser] by dividing the single large image into multiple tracks.  Phylogenetic information will also be represented as a new data track as part of this extension. <!-- try to limit this to about 50 words -->
  
'''Abstract:''' <!-- try to limit this to about 50 words -->
+
'''Student:''' [mailto:mokada23@hotmail.com Hisanaga Mark Okada]
  
'''Student:'''  
+
'''Mentor(s):''' [http://stein.cshl.edu/ Lincoln Stein]
  
'''Mentor(s):'''
+
'''Project homepage:''' [[PhyloSoC:Phylogenetic_and_Haplotype_Displays_for_GBrowse|Phylogenetic and Haplotype Displays for GBrowse]]
  
'''Project homepage:'''  
+
'''Source code:''' [http://gmod.cvs.sourceforge.net/gmod/Generic-Genome-Browser/libnew/Bio/Graphics/Glyph/phylo_align.pm?view=log phylo_align.pm in GBrowse] and [http://gmod.cvs.sourceforge.net/gmod/Generic-Genome-Browser/libnew/Bio/Graphics/Browser/ Bio::Graphics code within GMOD on SourceForge]
  
'''Source code:'''
+
===Estimation of divergence time priors from fossil occurrence data===
  
===Software development fostering the integration of molecular and paleobiological data in the estimation of species divergence times===
+
This application proposes a new project idea.
  
This application proposes a new project idea.  
+
'''Abstract:''' I will use C++ to develop a software tool that analyzes fossil occurrence data from the [http://paleodb.org/ Paleobiology Database] to calculate [http://en.wikipedia.org/wiki/Prior_probability informative priors] on divergence times, which will be implemented in the open source Bayesian molecular divergence time package [http://code.google.com/p/beast-mcmc/ BEAST].
  
'''Abstract:''' <!-- try to limit this to about 50 words -->
+
'''Student:''' [mailto:mdn3@duke.edu Michael Nowak]
  
'''Student:'''  
+
'''Mentor(s):''' Derrick Zwickl
  
'''Mentor(s):'''
+
'''Project homepage:''' [[PhyloSoC:Estimation_of_divergence_time_priors_from_fossil_occurrence_data|Estimation of divergence time priors from fossil occurrence data]]
  
'''Project homepage:'''  
+
'''Project blog:'''
 +
[http://divtimepriors.blogspot.com/ PhyloSoC: Divergence Time Priors from Fossil Occurrence Data]
  
'''Source code:'''  
+
'''Source code:''' [http://google-summer-of-code-2007-nescent.googlecode.com/files/MichaelDennis_Nowak.tar.gz MichaelDennis_Nowak.tar.gz]
  
 
===Visualizing Phylogeographic Information===
 
===Visualizing Phylogeographic Information===
Line 151: Line 153:
 
This application proposes a new project idea.  
 
This application proposes a new project idea.  
  
'''Abstract:''' <!-- try to limit this to about 50 words -->
+
'''Abstract:''' This project will develop a web based application that generates geographic maps of DNA haplotype data that are often used in phylogeographic analysis. The application will create maps of pie charts viewable through Google Earth that show the spatial distributions of haplotypes, the frequency of each haplotype within each population, and the number of samples per population.
  
'''Student:'''  
+
'''Student:''' [http://www.duke.edu/~yet2 Yi-Hsin Erica Tsai]
  
'''Mentor(s):'''
+
'''Mentor(s):''' [https://www.nescent.org/wg_EvoViz/User:DavidKidd David Kidd]
  
'''Project homepage:'''  
+
'''Project homepage:''' [[PhyloSoC:phylogeoviz|Visualizing Phylogeographic Information]]
 +
 +
'''Project blog:''' http://phylogeoviz.blogspot.com/
  
'''Source code:'''  
+
'''Source code:''' http://code.google.com/p/phylogeoviz/
  
 
===Biodiversity conservation algorithms and GUI===
 
===Biodiversity conservation algorithms and GUI===
Line 165: Line 169:
 
This application proposes a new project idea.  
 
This application proposes a new project idea.  
  
'''Abstract:''' <!-- try to limit this to about 50 words -->
+
'''Abstract:''' I will implement various algorithms that utilise phylogenetic information to prioritise species for biodiversity conservation. A GUI will also be developed allowing these algorithms to be utilised by conservation managers. The overall goal is to provide a package that brings together as many existing approaches as possible and provides an interface between mathematical results and their intended final audience.
 +
 
 +
'''Student:''' [http://www.kavenga.com/klaas Klaas Hartmann]
  
'''Student:'''  
+
'''Mentor(s):''' [http://www.tobias-thierer.de Tobias Thierer] (primary), [http://www.sfu.ca/~rvosa/ Rutger Vos] (secondary)
  
'''Mentor(s):'''
+
'''Project Blog:''' http://phylosoc2007khartmann.blogspot.com/
  
'''Project homepage:'''  
+
'''Project homepage:''' [[PhyloSoC:Biodiversity_Conservation_Algorithms_and_GUI|Biodiversity Conservation Algorithms and GUI]]
  
'''Source code:'''  
+
'''Source code:''' http://code.google.com/p/phylosoc2007khartmann/
  
 
==Ideas==
 
==Ideas==
Line 179: Line 185:
 
''Note: if there is more than one mentor for a project, the primary mentor is in '''bold font'''. Biographical and other information on the mentors is linked to in the [[#Mentors|Mentors section]].''
 
''Note: if there is more than one mentor for a project, the primary mentor is in '''bold font'''. Biographical and other information on the mentors is linked to in the [[#Mentors|Mentors section]].''
  
'''The student selection is now final and [http://code.google.com/soc/nescent/about.html Google has published our accepted projects], the students, and their mentors.'''
+
'''The student selection is now final and [http://code.google.com/soc/2007/nescent/about.html Google has published our accepted projects], the students, and their mentors.'''
 
   
 
   
 
<!--
 
<!--
Line 240: Line 246:
 
===Create a usable phyloinformatics API for BioJava===
 
===Create a usable phyloinformatics API for BioJava===
  
; Rationale : [http://www.biojava.org/ Biojava] has considerable sequence manipulation capability combined with distributions and good support for dynamic programming. It would be highly desirable to extend this capability to provide a phyloinformatics API.
+
; Goals : Start with a simple object model that can represent phyloinformatics objects and concepts and provide basic I/O with common formats. Build on the experimental code that is currently in org.biojavax.bio.phylo packages.
; Approach : Start with a simple object model that can represent phyloinformatics objects and concepts and provide basic I/O with common formats. Build on the experimental code that is currently in org.biojavax.bio.phylo packages.
+
; Challenges : A workable, and elegant data model with a flexible I/O that is consistent with other biojava patterns. Good documentation and Unit tests need to be built in right from the start. Concurrent example code and tutorials will be needed to maximise adoption of the API. Target JDK = 1.5.
; Challenges : A workable, and elegant data model with a flexible I/O that is consistent with other biojava patterns. Good documentation and Unit tests need to be built in right from the start. Concurrent example code and tutorials will be needed to maximise adoption of the API. Target JDK = 1.5?
 
 
; Involved toolkits or projects : [http://www.biojava.org/ biojava]
 
; Involved toolkits or projects : [http://www.biojava.org/ biojava]
; Mentors : '''Richard Holland''', Tobias Thierer, Mark Schreiber
+
; Mentors : '''[http://biojava.org/wiki/User:Rholland Richard Holland]''' (primary), [http://www.tobias-thierer.de Tobias Thierer], [http://biojava.org/wiki/User:Mark Mark Schreiber]
 +
; Student : Boh-Yun Lee
 +
; Project documentation and wiki: [http://biojava.org/wiki/BioJava:PhyloSOC07 biojava.org/wiki/BioJava:PhyloSOC07]
 +
; Source code: Will be stored in the BioJava CVS under biojava-live, package org.biojavax.phylo.
  
 
===Topological query application for BioSQL===
 
===Topological query application for BioSQL===
Line 274: Line 282:
 
* [http://www.ebi.ac.uk/Information/Staff/person_maint.php?s_person_id=751 Richard Holland]
 
* [http://www.ebi.ac.uk/Information/Staff/person_maint.php?s_person_id=751 Richard Holland]
 
* [http://biowiki.org/IanHolmes Ian Holmes]
 
* [http://biowiki.org/IanHolmes Ian Holmes]
 +
* [https://www.nescent.org/wg_EvoViz/User:DavidKidd David Kidd]
 
* [[User:Hlapp|Hilmar Lapp]]
 
* [[User:Hlapp|Hilmar Lapp]]
 
* [http://www.bioontology.org/project_team.html Suzi Lewis]
 
* [http://www.bioontology.org/project_team.html Suzi Lewis]
Line 285: Line 294:
 
* [http://stein.cshl.edu Lincoln Stein]
 
* [http://stein.cshl.edu Lincoln Stein]
 
* [http://www.molevol.org/camel/ Arlin Stoltzfus]
 
* [http://www.molevol.org/camel/ Arlin Stoltzfus]
* [http://www.tobias-thierer.de/ Tobias Thierer]
+
* [[User:Tobias.thierer|Tobias Thierer]] [http://www.tobias-thierer.de/]
 
* [http://www.sfu.ca/~rvosa/ Rutger Vos]
 
* [http://www.sfu.ca/~rvosa/ Rutger Vos]
  
Line 317: Line 326:
 
* Students apply between March 14-26, 2007. The [http://code.google.com/support/bin/answer.py?answer=60279&topic=10730 eligibility requirements for students] are in the GSoC FAQ.
 
* Students apply between March 14-26, 2007. The [http://code.google.com/support/bin/answer.py?answer=60279&topic=10730 eligibility requirements for students] are in the GSoC FAQ.
 
* [http://code.google.com/support/bin/answer.py?answer=60330&topic=10728 Development occurs on-line], there is no requirement to travel.
 
* [http://code.google.com/support/bin/answer.py?answer=60330&topic=10728 Development occurs on-line], there is no requirement to travel.
 +
 +
[[Category:PhyloSoC]]
 +
[[Category:Cyberinfrastructure]]
 +
[[Category:Internships]]

Latest revision as of 21:12, 22 October 2010

We applied to the Google Summer of Code (GSoC) program for the first time in 2007. On this page we collected project ideas, prerequisites, possible solution approaches, mentors, and other people or channels to contact for more information or to bounce ideas off.


Why

We believe that the goals, targets, and prior work of this Phyloinformatics working group make it particularly well suited as a mentoring organization for the GSoC program, for basically three reasons.

  1. The code that students will write will enable new and increasingly complex questions to be asked in comparative biology, one of the central disciplines in understanding the evolution of life. As part of its agenda, the Phyloinformatics group has already collected the use-cases from the research community that represent the most common and pervasive problems in phyloinformatics. Work on the suggested or similar projects is bound to make an impact. In addition, NESCent is committed to training end-users and scientific software developers in using the resulting work through summer courses and conference tutorials.
  2. The range of problems that students can make meaningful contributions to is diverse, enabling us to accommodate different areas of interest and skills. The participating toolkits, and the GSoC project ideas we have generated, cover a variety of programming languages and tasks, yet are all directed towards the same overall goal. A diverse group of mentors is on hand and can quickly be expanded to entire developer communities of the participating projects.
  3. Aside from gaining solutions to problems in phyloinformatics, we view our GSoC participation as an opportunity to gain future contributors to reusable open-source software components in phyloinformatics from the pool of future researchers in comparative biology. Once accepted, we will advertise this program through appropriate channels to reach undergrad and grad students interested in computational comparative biology. Some of the mentors can relate particularly well to students who are novices in research programming.

Accepted Projects

Note that the accepted projects, students, and their mentors are also published by Google.

A Perl-based Command Line Interface to a Topological Query Application for BioSQL in Support of High Throughput Classification and Analysis of LTR Retrotransposons in Plant Genomes

This is an application for the Topological query application for BioSQL project idea.

Abstract: I will use Perl to create a set of command line programs for topological queries in BioSQL. The goal of this project is to create an interface that is suitable for high-throughput creation and modification of SQL-based phylogenies. I will use this interface to further my research on the classification of plant LTR retrotransposons.

Student: James Estill

Mentor(s): Hilmar Lapp (primary), Weigang Qiu, Bill Piel, Mike Muratet (secondary)

Project Blog: http://phylosoc2007jestill.blogspot.com/

Project homepage: Command Line Topological Query Application for BioSQL

Source code: http://code.google.com/p/phylosoc2007jestill/

Phyloinformatics Web Tools: PhyloWidget

This application proposes a new project that builds upon the Topological query application for BioSQL project idea.

Abstract: The purpose of this project is to (a) create a user-friendly and web-accessible GUI for creating phylogenetic tree queries, and (b) implement a SOAP-based client and server protocol for transmitting phylogenetic queries and results.

Student: Gregory Jordan

Mentor(s): Bill Piel (primary), Hilmar Lapp (secondary)

Project homepage: http://www.phylowidget.org/

Source code: http://code.google.com/p/phylowidget/source

APIs for BioJava

This is an application for the Create a usable phyloinformatics API for BioJava project idea.

Abstract: We are planning to develop APIs for BioJava. In particular, we will focus on phylogeny reconstruction methods such as UPGMA, Maximum Parsimony, and Maximum Likelihood.

Student: Bohyun Lee

Mentor(s): Richard Holland

Project homepage: http://biojava.org/wiki/BioJava:PhyloSOC07

Source code: org.biojavax.bio.phylo package (SVN)

Multi-language bindings to the C++ NEXUS Class Library

This is an application for the Enable multi-language bindings to the C++ NEXUS Class Library project idea.

Abstract: The goal of this project is the development of bindings to NCL for three scripting languages (Perl, Python, Ruby) employing SWIG, the Simplified Wrapper and Interface Generator, which is an open source tool designed to facilitate the development of extensions from C/C++ to other languages. These bindings will provide a way for rapid prototyping and easy development of applications supporting the NEXUS format.

Student: David Suárez Pascal

Mentor: Mark Holder

Project homepage: Multi-language bindings to the C++ NEXUS Class Library

Source code: http://svn.pascal-in.net/nclbindings (subversion repository)

Phylogenetic XML <--> Object serialization

This is an application for the Phylogenetic XML <--> Object_serialization project idea.

Abstract: The goal of this project is to develop an XML based file format for use in phylogenetic analysis. This file type will contain most of the functionality of the current NEXUS file format. Additionally, a parser for this file type will be developed for the BioPerl package.

Student: Jason Caravas

Mentor(s): Rutger Vos

Project homepage: Phylogenetic XML <--> Object serialization

Source code: http://code.google.com/p/nexml07gsoc/source

Ajax interface for the XRate command-line tool

This is an application for the Evolve Unix phyloinformatics tools into Ajax applications project idea.

Abstract: This project focuses on building an easy-to-use, visual interface for Xrate. It will bring together several tools from the dart bioinformatics package to annotate multiple sequence alignments, train phylogenetic grammars, and visualize this information in an Ajax-enabled asynchronous web interface. As an initial application, it will be used to provide a web interface to explore and use the Rfam collection of grammars.

Student: Lars Barquist

Mentor(s): Ian Holmes

Project homepage: xREI biowiki page (includes instructions for downloading the current source via cvs)

Project Plan: http://ajax-xrate.googlecode.com/files/plan.pdf

Source code: LarsEric_Barquist.tar.gz (original google repository, currently out of date)

Demo: http://harmony.biowiki.org/xrei/

Implementing a web interface for command line-based bioinformatics tools

This application is being revised for the AJAX widgets for phylo-informatics project idea.

Abstract: This project aims to construct a web-based interface for the tree display and annotation feature of the xrate tool. Through the use of an easy-to-use web interface, those who are not familiar with the process of compiling their own software in a Unix environment, but are comfortable with using a web browser, will be able to use xrate's tree tools. It is hoped that such an interface will then be portable enough to be used on other similar tools to allow for easy deployment of any command-line-based bioinformatics utility. It uses Perl scripts to generate either .png or .svg images of trees to be displayed on a webpage. These trees are generated either from raw .tre files (Newick format) or from special .stk files, which are processed by xrate to generate trees.

Student: James Leung

Mentor(s): Suzanna Lewis

Project homepage: http://code.google.com/p/xrateparser/

Source code: http://code.google.com/p/xrateparser/source
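
The abstract for this project mentions trees stored as raw .tre files in Newick format. As a rough illustration of what reading such a file involves, here is a minimal Python sketch of a recursive-descent Newick parser. It is not taken from the xrateparser code (which uses Perl); the `Node` class and the function names `parse_newick` and `leaf_names` are hypothetical, and the sketch assumes unquoted labels and no bracketed comments.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One node of a parsed tree (hypothetical structure, for illustration)."""
    name: str = ""
    length: Optional[float] = None
    children: List["Node"] = field(default_factory=list)


def parse_newick(text: str) -> Node:
    """Parse a simple Newick string such as '((A:1,B:2)AB:0.5,C:3);'.

    Supports nested clades, optional labels and optional branch lengths.
    Quoted labels and [comments] are deliberately not handled.
    """
    text = text.strip()
    if not text.endswith(";"):
        raise ValueError("Newick string must end with ';'")
    pos = 0

    def parse_node() -> Node:
        nonlocal pos
        node = Node()
        if text[pos] == "(":
            pos += 1  # consume '('
            while True:
                node.children.append(parse_node())
                if text[pos] == ",":
                    pos += 1  # another sibling follows
                elif text[pos] == ")":
                    pos += 1  # end of this clade
                    break
                else:
                    raise ValueError(f"unexpected {text[pos]!r} at {pos}")
        # Optional label (leaf name or internal node name).
        start = pos
        while text[pos] not in ",():;":
            pos += 1
        node.name = text[start:pos].strip()
        # Optional branch length after ':'.
        if text[pos] == ":":
            pos += 1
            start = pos
            while text[pos] not in ",();":
                pos += 1
            node.length = float(text[start:pos])
        return node

    root = parse_node()
    if pos != len(text) - 1:
        raise ValueError("unexpected characters before final ';'")
    return root


def leaf_names(node: Node) -> List[str]:
    """Return leaf labels in left-to-right order."""
    if not node.children:
        return [node.name]
    names: List[str] = []
    for child in node.children:
        names.extend(leaf_names(child))
    return names
```

Calling `parse_newick("((A:1,B:2)AB:0.5,C:3);")` yields a root with two children, and `leaf_names` on that root lists the leaves in order. A drawing routine for .png or .svg output would walk the same structure.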

Phylogenetic & haplotype displays for GBrowse

This is an application for the Phylogenetic & haplotype displays for GBrowse project idea.

Abstract: This is a project to extend the visualization for the Genome Browser by dividing the single large image into multiple tracks. Phylogenetic information will also be represented as a new data track as part of this extension.

Student: Hisanaga Mark Okada

Mentor(s): Lincoln Stein

Project homepage: Phylogenetic and Haplotype Displays for GBrowse

Source code: phylo_align.pm in GBrowse and Bio::Graphics code within GMOD on SourceForge

Estimation of divergence time priors from fossil occurrence data

This application proposes a new project idea.

Abstract: I will use C++ to develop a software tool that analyzes fossil occurrence data from the Paleobiology Database to calculate informative priors on divergence times, which will be implemented in the open source Bayesian molecular divergence time package BEAST.

Student: Michael Nowak

Mentor(s): Derrick Zwickl

Project homepage: Estimation of divergence time priors from fossil occurrence data

Project blog: PhyloSoC: Divergence Time Priors from Fossil Occurrence Data

Source code: MichaelDennis_Nowak.tar.gz

Visualizing Phylogeographic Information

This application proposes a new project idea.

Abstract: This project will develop a web based application that generates geographic maps of DNA haplotype data that are often used in phylogeographic analysis. The application will create maps of pie charts viewable through Google Earth that show the spatial distributions of haplotypes, the frequency of each haplotype within each population, and the number of samples per population.

Student: Yi-Hsin Erica Tsai

Mentor(s): David Kidd

Project homepage: Visualizing Phylogeographic Information

Project blog: http://phylogeoviz.blogspot.com/

Source code: http://code.google.com/p/phylogeoviz/
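
The abstract for this project describes pie charts that show the frequency of each haplotype within each population and the number of samples per population. The following Python sketch shows one way those per-population numbers could be computed before any map is drawn. It is an illustration only, not the phylogeoviz implementation: the function name `haplotype_frequencies` and the input format (a list of population/haplotype pairs) are assumptions made for this example.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def haplotype_frequencies(samples: Iterable[Tuple[str, str]]) -> Dict[str, dict]:
    """Summarize (population, haplotype) pairs per population.

    Returns, for each population, the sample count 'n' and the relative
    frequency of each haplotype (the slice sizes of one pie chart).
    The input format is a hypothetical one chosen for this sketch.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for population, haplotype in samples:
        counts[population][haplotype] += 1

    summary: Dict[str, dict] = {}
    for population, counter in counts.items():
        total = sum(counter.values())
        summary[population] = {
            "n": total,
            "frequencies": {h: c / total for h, c in counter.items()},
        }
    return summary
```

Each entry of the result corresponds to one pie chart: `n` could scale the chart's size and `frequencies` gives its slices. Placing the charts on a map would additionally need coordinates for each population, which this sketch leaves out.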

Biodiversity conservation algorithms and GUI

This application proposes a new project idea.

Abstract: I will implement various algorithms that utilise phylogenetic information to prioritise species for biodiversity conservation. A GUI will also be developed allowing these algorithms to be utilised by conservation managers. The overall goal is to provide a package that brings together as many existing approaches as possible and provides an interface between mathematical results and their intended final audience.

Student: Klaas Hartmann

Mentor(s): Tobias Thierer (primary), Rutger Vos (secondary)

Project Blog: http://phylosoc2007khartmann.blogspot.com/

Project homepage: Biodiversity Conservation Algorithms and GUI

Source code: http://code.google.com/p/phylosoc2007khartmann/

Ideas

Note: if there is more than one mentor for a project, the primary mentor is in bold font. Biographical and other information on the mentors is linked to in the Mentors section.

The student selection is now final and Google has published our accepted projects, the students, and their mentors.

Enable multi-language bindings to the C++ NEXUS Class Library

Rationale 
The hackathon revealed that consistent behavior from NEXUS parsers is needed in all of the Bio* toolkits as well as many of the primary analysis tools in phylogenetics. Ra