NeXML and RDF API for BioRuby

From Phyloinformatics
Revision as of 14:27, 9 June 2010

Preface

The following document discusses the implementation of an NeXML parser and serializer and an RDF API for BioRuby. Note that this document is not final yet.

Parsing

Currently, all parsing is done up front (i.e. there is no streaming). This is likely to change later. To parse a NeXML file:

 doc = Bio::NeXML::Parser.new( "trees.xml" )
 nexml = doc.parse
 nexml.class #Bio::NeXML::Nexml
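To illustrate what parsing everything at the start means, here is a minimal sketch that uses REXML rather than the actual Bio::NeXML parser. The helper `parse_otus`, the constant `SAMPLE_NEXML`, and the simplified, namespace-free input are assumptions made for illustration only; they are not part of the API described in this document.

```ruby
require 'rexml/document'

# Simplified NeXML-like input (assumption: real files declare namespaces
# and carry more attributes and elements than shown here).
SAMPLE_NEXML = <<~XML
  <nexml version="0.9">
    <otus id="taxa1" label="Primates">
      <otu id="taxon1" label="Homo sapiens"/>
      <otu id="taxon2" label="Pan paniscus"/>
    </otus>
  </nexml>
XML

# Hypothetical helper: builds the whole document tree in memory first
# (no streaming) and returns { otus_id => { otu_id => label } }.
def parse_otus(xml)
  doc = REXML::Document.new(xml)
  result = {}
  doc.root.each_element do |otus|
    next unless otus.name == 'otus'
    taxa = {}
    otus.each_element do |otu|
      next unless otu.name == 'otu'
      taxa[otu.attributes['id']] = otu.attributes['label']
    end
    result[otus.attributes['id']] = taxa
  end
  result
end

otus_set = parse_otus(SAMPLE_NEXML)
puts otus_set['taxa1']['taxon1']
```

Because the full tree is built before any lookup happens, memory use grows with file size; a streaming parser would avoid that at the cost of a more involved implementation.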

Otus and Otu

Taxa blocks (otus) and taxa (otu) are stored internally in Ruby hashes for fast lookup by 'id'.

 nexml.otus_set #a hash of otus objects indexed with 'id'
 nexml.otus #an array of otus objects
 #iterate over each otus object
 nexml.each_otus do |taxa|
   puts taxa.id
   puts taxa.label
 end
 #find an otus by id
 taxa1 = nexml.get_otus_by_id "taxa1"
 #or, look it up in the hash directly
 taxa1 = nexml.otus_set[ "taxa1" ]
 
 taxa1.class #Bio::NeXML::Otus

 taxa1.otu_set #a hash of otu objects indexed with 'id'
 taxa1.otus #an array of otu objects
 #get an individual otu object given its id
 taxon1 = taxa1[ 'taxon1' ]
 #or, a conventional method call
 taxon1 = taxa1.get_otu_by_id 'taxon1'
 #or, get an otu from nexml object
 taxon1 = nexml.get_otu_by_id 'taxon1'
 #or iterate over each otu object
 #each_otu is an alias for each
 taxa1.each do |taxon|
   puts taxon.id
   puts taxon.label
 end
 #or iterate with id
 taxa1.each_with_id do |id, taxon|
   puts "#{id} => #{taxon.label}"
 end

Each otus object is enumerable. This could be especially useful once the NeXML class element is supported.

 taxa1.map &:id
 taxa1.select {|t| t.class == "Lemurs" } #maybe in future
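The combination of hash-backed storage and enumerable behaviour described above can be sketched in plain Ruby as follows. `OtusSketch` and `Otu` are hypothetical stand-ins, not the actual Bio::NeXML classes; only the method names `[]`, `each`, `each_with_id` and `otu_set` are taken from the examples in this section.

```ruby
# Hypothetical stand-in for an otu: just an id and a label.
Otu = Struct.new(:id, :label)

# Hypothetical stand-in for an otus container.
class OtusSketch
  include Enumerable

  attr_reader :id, :otu_set

  def initialize(id)
    @id = id
    @otu_set = {} # otu objects indexed by 'id'
  end

  # Add an otu; returns self so calls can be chained.
  def <<(otu)
    @otu_set[otu.id] = otu
    self
  end

  # Hash lookup by id.
  def [](id)
    @otu_set[id]
  end

  # Enumerable only requires each; map, select, etc. are built on it.
  def each(&block)
    @otu_set.each_value(&block)
  end

  # Yield both the id and the otu object.
  def each_with_id
    @otu_set.each { |id, otu| yield id, otu }
  end
end

taxa1 = OtusSketch.new('taxa1')
taxa1 << Otu.new('taxon1', 'Homo sapiens') << Otu.new('taxon2', 'Pan paniscus')

puts taxa1['taxon2'].label
puts taxa1.map(&:id).inspect
taxa1.each_with_id { |id, taxon| puts "#{id} => #{taxon.label}" }
```

Keeping the hash as the single source of truth means the array view and the iteration order both follow insertion order, which Ruby hashes preserve.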

Trees and Tree

Get a trees object:

 nexml.trees #return an array of trees objects.
 trees1 = nexml.trees[0]
 trees1.class #Bio::NeXML::Trees
 #get the taxa block that the trees object is linked to
 trees1.otus

Currently, a tree can have only one root node. To work with an individual tree:

 #get a tree object with its 'id'
 tree1 = trees1[ 'tree1' ]
 tree1.class #Bio::NeXML::IntTree or Bio::NeXML::FloatTree
 #or iterate over each tree object
 trees1.each do |tree|
   puts tree.id
   puts tree.label
 end

All the available methods from Bio::Tree class can be called on a tree object.

 node1 = tree1.get_node_by_name "n3" #note that the name is the same as the id
 tree1.parents node1

A trees object is an enumerable:

 trees1.map &:id