Difference between revisions of "PhyloSoC:BioPerl integration of the NeXML exchange standard and Bio::Phylo toolkit/nexml module design"

From Phyloinformatics

Revision as of 10:31, 8 June 2009

Nexml Item Module Design

Currently, parsing a NeXML file into BioPerl relies on the *IO modules (AlignIO, SeqIO, and TreeIO) to load NeXML-represented data into BioPerl objects. Each of these modules handles a single data type, so a user who wants everything in a file must parse it three times: once each for sequences, alignments, and trees. The ultimate goal is to make this easier by fully parsing a NeXML file in one pass of the Bio::Phylo parser. To accomplish this, a Bio::nexml module will be created that handles an entire NeXML file, including the three different data types it contains. A powerful feature of NeXML is its ability to relate data items to each other in relevant ways, so this module will also preserve, as far as possible, the relationships between the different data types.
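The single-pass idea can be illustrated with a minimal sketch. It is written in Python with the standard library only, so it is not the planned Bio::nexml module and does not use Bio::Phylo or BioPerl. The sample document is a simplified, hand-written NeXML-like fragment, and the function names (local_name, parse_once) are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

# Simplified, hand-written NeXML-like fragment (an assumption for illustration;
# real NeXML files carry more attributes and state definitions).
SAMPLE = """<nexml xmlns="http://www.nexml.org/2009" version="0.9">
  <otus id="taxa1">
    <otu id="t1" label="Homo sapiens"/>
    <otu id="t2" label="Pan troglodytes"/>
  </otus>
  <characters id="chars1" otus="taxa1">
    <matrix>
      <row id="r1" otu="t1"><seq>ACGT</seq></row>
      <row id="r2" otu="t2"><seq>ACGA</seq></row>
    </matrix>
  </characters>
  <trees id="trees1" otus="taxa1">
    <tree id="tree1">
      <node id="n1" otu="t1"/>
      <node id="n2" otu="t2"/>
      <node id="n3"/>
    </tree>
  </trees>
</nexml>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def parse_once(xml_text):
    """Read the document a single time and sort what is found by data type."""
    found = {"taxa": [], "rows": [], "trees": []}
    stream = io.BytesIO(xml_text.encode("utf-8"))
    # 'end' events fire once an element and all of its children are complete.
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "otu":
            found["taxa"].append(elem.get("id"))
        elif name == "row":
            # Take the text of the row's <seq> child, if it has one.
            seq = next((c.text for c in elem if local_name(c.tag) == "seq"), None)
            found["rows"].append((elem.get("id"), seq))
        elif name == "tree":
            found["trees"].append(elem.get("id"))
    return found


if __name__ == "__main__":
    print(parse_once(SAMPLE))
```

The point of the sketch is only that one traversal can feed all three collections, which is what removes the need for three separate parses.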

Data Relationships to Maintain

  • Sequences <=> Nodes
    • How to do this?
  • Alignments <=> Trees
    • How to do this?
  • Taxa <=> Trees
    • How to do this?
  • Taxa <=> Alignments
    • How to do this?
  • Taxon <=> Node
    • How to do this?
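One possible answer to the open questions above, offered only as a sketch: NeXML rows and tree nodes can point at a taxon through an otu attribute, and character and tree blocks can point at a taxa block through an otus attribute, so the relationships could be recovered by indexing everything on those identifiers. The Python below uses the standard library only and is not BioPerl or Bio::Phylo code. The sample fragment is simplified and hand-written, and the names link_by_taxon, blocks, and taxa are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written NeXML-like fragment (an assumption for illustration).
SAMPLE = """<nexml xmlns="http://www.nexml.org/2009" version="0.9">
  <otus id="taxa1">
    <otu id="t1"/>
    <otu id="t2"/>
  </otus>
  <characters id="chars1" otus="taxa1">
    <matrix>
      <row id="r1" otu="t1"><seq>ACGT</seq></row>
      <row id="r2" otu="t2"><seq>ACGA</seq></row>
    </matrix>
  </characters>
  <trees id="trees1" otus="taxa1">
    <tree id="tree1">
      <node id="n1" otu="t1"/>
      <node id="n2" otu="t2"/>
      <node id="n3"/>
    </tree>
  </trees>
</nexml>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def link_by_taxon(xml_text):
    """Index alignments, trees, rows, and nodes by the taxa they reference."""
    root = ET.fromstring(xml_text)
    blocks = {}  # otus id -> taxa, alignments, and trees that share it
    taxa = {}    # otu id  -> sequence rows and tree nodes that share it

    def block(otus_id):
        return blocks.setdefault(otus_id, {"taxa": [], "alignments": [], "trees": []})

    def taxon(otu_id):
        return taxa.setdefault(otu_id, {"rows": [], "nodes": []})

    for top in root:
        kind = local_name(top.tag)
        if kind == "otus":
            for otu in top:
                if local_name(otu.tag) == "otu":
                    block(top.get("id"))["taxa"].append(otu.get("id"))
                    taxon(otu.get("id"))
        elif kind == "characters":
            # Taxa <=> Alignments: the block names its taxa block.
            block(top.get("otus"))["alignments"].append(top.get("id"))
            for row in top.iter():
                if local_name(row.tag) == "row" and row.get("otu"):
                    taxon(row.get("otu"))["rows"].append(row.get("id"))
        elif kind == "trees":
            for tree in top:
                if local_name(tree.tag) != "tree":
                    continue
                # Taxa <=> Trees: each tree inherits the block's taxa block.
                block(top.get("otus"))["trees"].append(tree.get("id"))
                for node in tree:
                    # Taxon <=> Node: internal nodes usually have no otu.
                    if local_name(node.tag) == "node" and node.get("otu"):
                        taxon(node.get("otu"))["nodes"].append(node.get("id"))
    return blocks, taxa


if __name__ == "__main__":
    blocks, taxa = link_by_taxon(SAMPLE)
    print(blocks)
    print(taxa)
```

With these two indexes, Sequences <=> Nodes follows from the rows and nodes that share a taxon entry, and Alignments <=> Trees follows from the alignments and trees that share a taxa block. Whether the BioPerl objects should hold these links directly or look them up through a shared container is left open, as in the plan above.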

This design plan is a work in progress