BioRuby PhyloXML HowTo documentation

From Phyloinformatics
Revision as of 17:28, 13 August 2009 by Dianaj (talk) (How to parse a file)


PhyloXML is an XML language for saving, analyzing, and exchanging annotated phylogenetic trees. In BioRuby, the PhyloXML parser is implemented in Bio::PhyloXML::Parser and the writer in Bio::PhyloXML::Writer. More information at


In addition to the BioRuby library, you need the libxml Ruby bindings. To install them:

gem install -r libxml-ruby

How to parse a file

require 'bio'
# Create new phyloxml parser
phyloxml = Bio::PhyloXML::Parser.open('example.xml')
# Print the names of all trees in the file
phyloxml.each do |tree|
  puts tree.name
end
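
If you do not have a phyloXML file at hand, the following sketch writes a small one to try the parser on. It uses only the Ruby standard library. The element names are intended to follow the phyloXML format, but the tree names, clade names, and branch lengths are made up for illustration, and the helper write_sample_phyloxml and the constant SAMPLE_PHYLOXML are hypothetical names, not part of BioRuby.

```ruby
require 'tmpdir'

# Hypothetical sample data: two small phylogenies in phyloXML form.
# The names and branch lengths are invented for illustration.
SAMPLE_PHYLOXML = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <phyloxml xmlns="http://www.phyloxml.org">
    <phylogeny rooted="true">
      <name>first example tree</name>
      <clade>
        <clade>
          <name>A</name>
          <branch_length>0.1</branch_length>
        </clade>
        <clade>
          <name>B</name>
          <branch_length>0.2</branch_length>
        </clade>
      </clade>
    </phylogeny>
    <phylogeny rooted="true">
      <name>second example tree</name>
      <clade>
        <clade>
          <name>C</name>
        </clade>
        <clade>
          <name>D</name>
        </clade>
      </clade>
    </phylogeny>
  </phyloxml>
XML

# Write the sample document to the given path and return the path.
def write_sample_phyloxml(path)
  File.write(path, SAMPLE_PHYLOXML)
  path
end

sample_path = write_sample_phyloxml(File.join(Dir.tmpdir, 'example.xml'))
puts "Wrote #{sample_path}"
```

The printed path can then be passed to the parser in place of 'example.xml'.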

If the file contains several trees, you can access the one you want by its index:

tree = phyloxml[3]

You can use all Bio::Tree methods on the tree. For example,

tree.leaves.each do |node|
  puts node.name
end

PhyloXML files can hold additional information besides phylogenies at the end of the file. This information is available through the 'other' array of the parser object. The example below reads through all the trees before printing it.

phyloxml = Bio::PhyloXML::Parser.open('example.xml')
while tree = phyloxml.next_tree
  # do stuff with trees
end
puts phyloxml.other

How to write a file

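# Create new phyloxml writer
writer = Bio::PhyloXML::Writer.new('tree.xml')
# tree1 and tree2 stand for trees obtained elsewhere, e.g. from a parser
# Write tree to the file tree.xml
writer.write(tree1)
# Add another tree to the file
writer.write(tree2)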

How to retrieve data

Here is an example of how to retrieve the scientific name of the clades.

require 'bio'
phyloxml = Bio::PhyloXML::Parser.open('ncbi_taxonomy_mollusca.xml')
phyloxml.each do |tree|
  tree.each_node do |node|
    print "Scientific name: ", node.taxonomies[0].scientific_name, "\n"
  end
end
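
The example above assumes every node carries at least one taxonomy. In files where some clades have none, taxonomies[0] would be nil and the call to scientific_name would fail. Below is a minimal sketch of a guard for that case. The helper first_scientific_name and the FakeNode and FakeTaxonomy structs are hypothetical stand-ins so the snippet runs without BioRuby; only the taxonomies and scientific_name accessors come from the example above.

```ruby
# Hypothetical stand-ins for illustration only. With BioRuby, the nodes
# come from tree.each_node and already respond to #taxonomies.
FakeTaxonomy = Struct.new(:scientific_name)
FakeNode = Struct.new(:name, :taxonomies)

# Return the scientific name of the node's first taxonomy,
# or nil when the node has no taxonomy.
def first_scientific_name(node)
  taxonomy = node.taxonomies.first
  taxonomy && taxonomy.scientific_name
end

nodes = [
  FakeNode.new('root', []),
  FakeNode.new('leaf', [FakeTaxonomy.new('Octopus vulgaris')])
]

# Print only the nodes that actually have a scientific name.
nodes.each do |node|
  name = first_scientific_name(node)
  puts "Scientific name: #{name}" if name
end
```

With a parsed tree, the same helper could be called on each node inside tree.each_node in place of node.taxonomies[0].scientific_name.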