NeXML and RDF API for BioRuby

From Phyloinformatics
Revision as of 15:17, 9 June 2010 by Yeban (talk) (Trees and Tree)

Preface

The following document discusses the implementation of an NeXML parser and serializer and an RDF API for BioRuby. Note that this document is not final yet.

Parsing

Currently the whole document is parsed up front (i.e. there is no streaming). This is likely to change later. To parse an NeXML file:

 doc = Bio::NeXML::Parser.new( "trees.xml" )
 nexml = doc.parse
 nexml.class #Bio::NeXML::Nexml
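
To make "parsed up front" concrete, here is a minimal sketch of eager parsing using REXML from Ruby's standard distribution. It is not how Bio::NeXML::Parser is implemented; the inline sample document, the SAMPLE constant and the otus_set variable are assumptions made for illustration only.

```ruby
require 'rexml/document'

# A tiny inline NeXML-like document (illustrative; real files carry more detail).
SAMPLE = <<~NEXML
  <nex:nexml xmlns:nex="http://www.nexml.org/2009" xmlns="http://www.nexml.org/2009" version="0.9">
    <otus id="taxa1" label="Primates">
      <otu id="taxon1" label="Homo sapiens"/>
      <otu id="taxon2" label="Pan paniscus"/>
    </otus>
  </nex:nexml>
NEXML

# Eager parsing: the whole document tree is built in memory before any lookup.
doc = REXML::Document.new(SAMPLE)

# Collect every otus block into a hash indexed by id; each value maps
# otu id => otu label.
otus_set = {}
doc.root.elements.each do |block|
  next unless block.name == 'otus'

  labels = {}
  block.elements.each do |otu|
    next unless otu.name == 'otu'

    labels[otu.attributes['id']] = otu.attributes['label']
  end
  otus_set[block.attributes['id']] = labels
end

puts otus_set['taxa1']['taxon1']
```

A streaming parser would instead hand over each otus block as it is read, without holding the whole tree in memory.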

Otus and Otu

Taxa blocks (otus) and taxa (otu) are stored internally in Ruby hashes for faster id-based lookup.

 nexml.otus_set #a hash of otus objects indexed with 'id'
 nexml.otus #an array of otus objects
 #iterate over each otus object
 nexml.each_otus do |taxa|
   puts taxa.id
   puts taxa.label
 end
 #find an otus by id
 taxa1 = nexml.get_otus_by_id "taxa1"
 taxa1.class #Bio::NeXML::Otus

 taxa1.otu_set #a hash of otu objects indexed with 'id'
 taxa1.otus #an array of otu objects
 #get an individual otu object given its id
 taxon1 = taxa1[ 'taxon1' ]
 #or, a conventional method call
 taxon1 = taxa1.get_otu_by_id 'taxon1'
 #or, get an otu from nexml object
 taxon1 = nexml.get_otu_by_id 'taxon1'
 #or iterate over each otu object
 #each_otu is an alias for each
 taxa1.each do |taxon|
   puts taxon.id
   puts taxon.label
 end
 #or iterate with id
 taxa1.each_with_id do |id, taxon|
   puts "#{id} => #{taxon.label}"
 end
 #check if an otu belongs to an otus or not
 #pass it an otu id
 #include? is an alias for has_otu?
 taxa1.has_otu? 'taxon1' # => true or false

Each otus object is Enumerable. This could be especially useful once the class element is supported.

 taxa1.map &:id
 taxa1.select {|t| t.class == "Lemurs" } #maybe in future
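
The design described in this section (a hash indexed by id, wrapped in an Enumerable container) can be sketched as follows. OtuSketch and OtusSketch are hypothetical stand-ins, not the real Bio::NeXML::Otu and Bio::NeXML::Otus classes; only the method names are taken from the examples above.

```ruby
# Hypothetical stand-in for an otu: just an id and a label.
OtuSketch = Struct.new(:id, :label)

# Hypothetical stand-in for an otus block.
class OtusSketch
  include Enumerable

  attr_reader :id, :label, :otu_set

  def initialize(id, label = nil)
    @id = id
    @label = label
    @otu_set = {} # otu objects indexed by id
  end

  # Add an otu, indexing it by its id.
  def <<(otu)
    @otu_set[otu.id] = otu
    self
  end

  # Array of otu objects, in insertion order.
  def otus
    @otu_set.values
  end

  # Hash lookup by id; returns nil for an unknown id.
  def [](id)
    @otu_set[id]
  end
  alias_method :get_otu_by_id, :[]

  def has_otu?(id)
    @otu_set.key?(id)
  end
  alias_method :include?, :has_otu?

  # Enumerable builds map, select, etc. on top of each.
  def each(&block)
    @otu_set.each_value(&block)
  end
  alias_method :each_otu, :each

  def each_with_id(&block)
    @otu_set.each(&block)
  end
end

taxa1 = OtusSketch.new('taxa1', 'Primates')
taxa1 << OtuSketch.new('taxon1', 'Homo sapiens') << OtuSketch.new('taxon2', 'Pan paniscus')

puts taxa1['taxon1'].label
puts taxa1.has_otu?('taxon3')
puts taxa1.map(&:id).inspect
```

Because the container defines only each, everything else (map, select, find and so on) comes from Enumerable for free, while id lookups go straight to the hash.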

Trees and Tree

Trees blocks, and the trees and networks inside them, are stored internally in Ruby hashes for faster id-based lookup.

 nexml.trees_set #return a hash of trees objects indexed with 'id'
 nexml.trees #return an array of trees objects
 #iterate over each trees object
 nexml.each_trees do |trees|
   puts trees.id
   puts trees.label
 end
 #find a trees by id
 trees1 = nexml.get_trees_by_id 'trees1'
 trees1.class #Bio::NeXML::Trees
 #get the taxa block to which the trees object is linked
 trees1.otus #returns an otus object

 trees1.tree_set #return a hash of tree objects indexed with 'id'
 trees1.trees #return an array of tree objects
 #iterate over each tree object
 trees1.each_tree do |t|
   puts t.id
   puts t.label
 end
 #get a tree object with its id
 tree1 = trees1[ 'tree1' ]
 #or, with a conventional method call
 tree1 = trees1.get_tree_by_id 'tree1'
 #or, from a nexml object
 tree1 = nexml.get_tree_by_id 'tree1'
 tree1.class #Bio::NeXML::IntTree or Bio::NeXML::FloatTree
 #check if a tree belongs to a trees or not
 #pass it a tree id
 trees1.has_tree? 'tree1' #return true or false

 trees1.network_set #return a hash of network objects indexed with 'id'
 trees1.networks #return an array of network objects
 #iterate over each network object
 trees1.each_network do |n|
   puts n.id
   puts n.label
 end
 #get a network object with its id
 network1 = trees1[ 'network1' ]
 #or, with a conventional method call
 network1 = trees1.get_network_by_id 'network1'
 #or, from a nexml object
 network1 = nexml.get_network_by_id 'network1'
 network1.class #Bio::NeXML::IntNetwork or Bio::NeXML::FloatNetwork
 #check if a network belongs to a trees or not
 #pass it a network id
 trees1.has_network? 'network1' #return true or false

 #iterate over both trees and networks
 trees1.each do |g|
   puts g.class
 end
 #find if a tree or a network belongs to a trees or not
 #include? is an alias for has?
 trees1.has? 'tree1' #return true or false
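
One way a trees block could hold trees and networks side by side is sketched below, with one hash per kind and a combined each and has?. TreeSketch, NetworkSketch, TreesSketch, add_tree and add_network are hypothetical names for illustration, not the real Bio::NeXML API; the sketch also assumes ids are unique across trees and networks within a block.

```ruby
# Hypothetical stand-ins for tree and network objects.
TreeSketch = Struct.new(:id, :label)
NetworkSketch = Struct.new(:id, :label)

# Hypothetical stand-in for a trees block.
class TreesSketch
  include Enumerable

  attr_reader :tree_set, :network_set

  def initialize
    @tree_set = {}    # tree objects indexed by id
    @network_set = {} # network objects indexed by id
  end

  def add_tree(tree)
    @tree_set[tree.id] = tree
  end

  def add_network(network)
    @network_set[network.id] = network
  end

  def trees
    @tree_set.values
  end

  def networks
    @network_set.values
  end

  # Look in the tree hash first, then in the network hash.
  def [](id)
    @tree_set[id] || @network_set[id]
  end

  def has_tree?(id)
    @tree_set.key?(id)
  end

  def has_network?(id)
    @network_set.key?(id)
  end

  def has?(id)
    has_tree?(id) || has_network?(id)
  end
  alias_method :include?, :has?

  # Iterate over trees first, then networks.
  def each(&block)
    return enum_for(:each) unless block

    @tree_set.each_value(&block)
    @network_set.each_value(&block)
    self
  end
end

trees1 = TreesSketch.new
trees1.add_tree(TreeSketch.new('tree1', 'A tree'))
trees1.add_network(NetworkSketch.new('network1', 'A network'))

trees1.each { |g| puts g.class }
puts trees1.has?('network1')
```

Keeping two hashes lets tree_set and network_set stay cheap to return, at the cost of checking both when an id of unknown kind is looked up.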

All the methods available in the Bio::Tree class can be called on a tree object.

 node1 = tree1.get_node_by_name "n3" #note name is same as id
 tree1.parents node1

A trees object is an enumerable:

 trees1.map &:id