Difference between revisions of "NeXML and RDF API for BioRuby"

From Phyloinformatics
Revision as of 04:39, 3 June 2010

Preface

The following document discusses the implementation of an NeXML parser and serializer and an RDF API for BioRuby. Note that this document is not final yet.

Parsing

Currently, all parsing is done up front (i.e. there is no streaming). This is likely to change later. To parse an NeXML file:

 doc = Bio::NeXML::Parser.new( "trees.xml" )
 nexml = doc.parse
 nexml.class #Bio::NeXML::Nexml

Otus and Otu

Taxa blocks are stored internally as a Ruby hash for faster id-based lookup.

 nexml.otus_set #a hash of otus objects indexed with 'id'
 nexml.otus #an array of otus objects
 
 #iterate over each otus object
 nexml.each_otus do |taxa|
   puts taxa.id
   puts taxa.label
 end
 
 #find an otus by id
 taxa1 = nexml.get_otus_by_id "taxa1"
 taxa1.class #Bio::NeXML::Otus
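
The id-indexed storage described above can be illustrated with a small, self-contained sketch. It does not use Bio::NeXML: the `Block` struct and the `TaxaStore` class, along with their method names, are hypothetical and chosen only for illustration. The sketch merely mirrors the idea of keeping a hash keyed by id alongside an array view.

```ruby
# Hypothetical stand-in for an otus block; not the Bio::NeXML class.
Block = Struct.new(:id, :label)

# Hypothetical container that keeps blocks in a hash keyed by id.
class TaxaStore
  def initialize
    @set = {} # id => block; Ruby hashes preserve insertion order
  end

  # Add a block, indexing it by its id. Returns self so calls can be chained.
  def <<(block)
    @set[block.id] = block
    self
  end

  # Hash view, analogous in spirit to otus_set.
  def set
    @set
  end

  # Array view, analogous in spirit to otus.
  def to_a
    @set.values
  end

  # Hash lookup by id; returns nil when the id is unknown.
  def by_id(id)
    @set[id]
  end
end

store = TaxaStore.new
store << Block.new("taxa1", "Primates") << Block.new("taxa2", "Rodents")

puts store.by_id("taxa1").label   # prints "Primates"
puts store.to_a.map(&:id).inspect # prints ["taxa1", "taxa2"]
```

Looking a block up through the hash avoids scanning the whole array for a matching id, which is presumably the motivation for the design described here.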

Similarly, individual taxa are stored internally as a Ruby hash indexed by 'id'. To work with otu objects:

 taxa1.otu_set #a hash of otu objects indexed with 'id'
 taxa1.otus #an array of otu objects
 #get an individual otu object given its id
 taxon1 = taxa1[ 'taxon1' ]
 #or iterate over each otu object
 taxa1.each do |taxon|
   puts taxon.id
   puts taxon.label
 end
 

Each otus object is enumerable:

 taxa1.map &:id
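
As a rough illustration of how a container can become enumerable, the sketch below defines `each` and includes Ruby's `Enumerable` module, which then supplies `map`, `find`, `select` and so on. `Taxon` and `TaxonBlock` are hypothetical names, not the Bio::NeXML classes, and the real implementation may differ.

```ruby
# Hypothetical stand-ins; these are not the Bio::NeXML classes.
Taxon = Struct.new(:id, :label)

class TaxonBlock
  include Enumerable # supplies map, select, find, ... on top of #each

  def initialize(taxa)
    @by_id = {}
    taxa.each { |t| @by_id[t.id] = t }
  end

  # Look up a single taxon by id.
  def [](id)
    @by_id[id]
  end

  # Yield each taxon; return an Enumerator when no block is given.
  def each(&blk)
    return enum_for(:each) unless blk
    @by_id.each_value(&blk)
    self
  end
end

taxa_block = TaxonBlock.new([Taxon.new("t1", "Homo sapiens"),
                             Taxon.new("t2", "Pan troglodytes")])

ids = taxa_block.map(&:id)
pan = taxa_block.find { |t| t.label.start_with?("Pan") }

puts ids.inspect # prints ["t1", "t2"]
puts pan.id      # prints "t2"
```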

Trees and Tree

Get a trees object:

 nexml.trees #return an array of trees objects.
 trees1 = nexml.trees[0]
 trees1.class #Bio::NeXML::Trees

 #get the taxa block to which the trees object is linked
 trees1.otus

Currently, a tree can have only one root node. To work with an individual tree:

 #get a tree object with its 'id'
 tree1 = trees1[ 'tree1' ]
 tree1.class #Bio::NeXML::IntTree or Bio::NeXML::FloatTree
 
 #or iterate over each tree object
 trees1.each do |tree|
   puts tree.id
   puts tree.label
 end

All the available methods of the Bio::Tree class (http://bioruby.org/rdoc/classes/Bio/Tree.html#M001688) can be called on a tree object.

 node1 = tree1.get_node_by_name "n3" #note name is same as id
 tree1.parents node1
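
To make the idea of querying a node's ancestry concrete, here is a plain-Ruby sketch that does not depend on Bio::Tree. It assumes a rooted tree with exactly one root, stored as a hypothetical child-to-parent hash. The helper name `ancestors_of` is invented for illustration and is not claimed to match the behaviour of any Bio::Tree method.

```ruby
# Child id => parent id for a small rooted tree (the root has no entry).
#        n1
#       /  \
#     n2    n5
#    /  \
#  n3    n4
PARENT_OF = {
  "n2" => "n1",
  "n5" => "n1",
  "n3" => "n2",
  "n4" => "n2"
}.freeze

# Hypothetical helper: walk upwards from a node until the root is reached,
# collecting each parent along the way (nearest parent first).
def ancestors_of(node, parent_of = PARENT_OF)
  path = []
  while (parent = parent_of[node])
    path << parent
    node = parent
  end
  path
end

puts ancestors_of("n3").inspect # prints ["n2", "n1"]
puts ancestors_of("n1").inspect # prints []
```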

A trees object is enumerable:

 trees1.map &:id