Difference between revisions of "NeXML and RDF API for BioRuby"

From Phyloinformatics

Revision as of 15:07, 9 June 2010

Preface

The following document discusses the implementation of an NeXML parser and serializer and an RDF API for BioRuby. Note that this document is not final yet.

Parsing

Currently all parsing is done up front (i.e. there is no streaming). This is likely to change later. To parse a NeXML file:

 doc = Bio::NeXML::Parser.new( "trees.xml" )
 nexml = doc.parse
 nexml.class #Bio::NeXML::Nexml

Otus and Otu

Taxa blocks (otus) and their taxa (otu) are stored internally in Ruby hashes for faster 'id'-based lookup.

 nexml.otus_set #a hash of otus objects indexed with 'id'
 nexml.otus #an array of otus objects
 #iterate over each otus object
 nexml.each_otus do |taxa|
   puts taxa.id
   puts taxa.label
 end
 #find an otus by id
 taxa1 = nexml.get_otus_by_id "taxa1"
 taxa1.class #Bio::NeXML::Otus

 taxa1.otu_set #a hash of otu objects indexed with 'id'
 taxa1.otus #an array of otu objects
 #get an individual otu object given its id
 taxon1 = taxa1[ 'taxon1' ]
 #or, a conventional method call
 taxon1 = taxa1.get_otu_by_id 'taxon1'
 #or, get an otu from nexml object
 taxon1 = nexml.get_otu_by_id 'taxon1'
 #or iterate over each otu object
 #each_otu is an alias for each
 taxa1.each do |taxon|
   puts taxon.id
   puts taxon.label
 end
 #or iterate with id
 taxa1.each_with_id do |id, taxon|
   puts "#{id} => #{taxon.label}"
 end
 #check if an otu belongs to an otus or not
 #pass it an otu id
 taxa1.has_otu? 'taxon1' # => true or false
 #or pass an otu object
 otu = Bio::NeXML::Otu.new 't1'
 taxa1.include? otu   #include? is an alias for has_otu?
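
The hash-backed lookup described in this section can be sketched in plain Ruby. This is only an illustration of the design, not the Bio::NeXML implementation: OtuSketch and OtusSketch are hypothetical stand-ins, and their constructors and the << method are assumptions.

```ruby
# Hypothetical stand-ins for Bio::NeXML::Otu and Bio::NeXML::Otus;
# the class names, constructors and << are assumptions for illustration.
OtuSketch = Struct.new(:id, :label)

class OtusSketch
  attr_reader :id, :otu_set

  def initialize(id)
    @id = id
    @otu_set = {} # otu objects indexed by 'id'
  end

  # Add an otu, keyed by its id; returns self so calls can be chained.
  def <<(otu)
    @otu_set[otu.id] = otu
    self
  end

  # Array view of the stored otu objects.
  def otus
    @otu_set.values
  end

  # Hash lookup by id; returns nil when the id is unknown.
  def [](id)
    @otu_set[id]
  end
  alias_method :get_otu_by_id, :[]

  # Accept either an id string or an otu object.
  def has_otu?(otu_or_id)
    key = otu_or_id.respond_to?(:id) ? otu_or_id.id : otu_or_id
    @otu_set.key?(key)
  end
  alias_method :include?, :has_otu?

  # Yield each id together with its otu.
  def each_with_id(&block)
    @otu_set.each(&block)
  end
end

taxa1 = OtusSketch.new('taxa1')
taxa1 << OtuSketch.new('taxon1', 'Taxon 1') << OtuSketch.new('taxon2', 'Taxon 2')

puts taxa1['taxon1'].label
puts taxa1.has_otu?('taxon3')
taxa1.each_with_id { |id, taxon| puts "#{id} => #{taxon.label}" }
```

Keeping the hash as the primary store and deriving the array from it means id lookups never need to scan the whole collection.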

Each otus object is enumerable. This could be especially useful once the class element is supported.

 taxa1.map &:id
 taxa1.select {|t| t.class == "Lemurs" } #maybe in future
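
As a rough sketch of how such a container can become enumerable, a class only needs to include Ruby's Enumerable module and define each. The names below (TaxonSketch, EnumerableOtusSketch) are hypothetical, and the group attribute is an assumed stand-in for the not-yet-supported class element; it is deliberately not called class, since that would shadow Object#class.

```ruby
# Hypothetical container, not the Bio::NeXML::Otus class itself.
# 'group' is an assumed stand-in for the future NeXML class element.
TaxonSketch = Struct.new(:id, :label, :group)

class EnumerableOtusSketch
  include Enumerable

  def initialize(taxa)
    # Index the taxa by id, mirroring the hash-based storage.
    @otu_set = taxa.to_h { |t| [t.id, t] }
  end

  # Enumerable only requires #each; iterate over the stored values.
  def each(&block)
    return enum_for(:each) unless block
    @otu_set.each_value(&block)
  end
end

taxa = EnumerableOtusSketch.new([
  TaxonSketch.new('t1', 'Taxon 1', 'Lemurs'),
  TaxonSketch.new('t2', 'Taxon 2', 'Apes')
])

ids    = taxa.map(&:id)                          # map comes from Enumerable
lemurs = taxa.select { |t| t.group == 'Lemurs' } # so does select
p ids
p lemurs.map(&:label)
```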

Trees and Tree

Trees objects, and the tree and network objects they contain, are stored internally in Ruby hashes for faster 'id'-based lookup.

 nexml.trees_set #returns a hash of trees objects indexed with 'id'
 nexml.trees #returns an array of trees objects
 #iterate over each trees object
 nexml.each_trees do |trees|
   puts trees.id
   puts trees.label
 end
 #find a trees by id
 trees1 = nexml.get_trees_by_id 'trees1'
 trees1.class #Bio::NeXML::Trees
 #get the taxa block to which the trees is linked to
 trees1.otus #returns an otus object

 trees1.tree_set #returns a hash of tree objects indexed with 'id'
 trees1.trees #returns an array of tree objects
 #iterate over each tree object
 trees1.each_tree do |t|
   puts t.id
   puts t.label
 end
 #get a tree object by its id
 tree1 = trees1[ 'tree1' ]
 #or, with a conventional method call
 tree1 = trees1.get_tree_by_id 'tree1'
 #or, from a nexml object
 tree1 = nexml.get_tree_by_id 'tree1'
 tree1.class #Bio::NeXML::IntTree or Bio::NeXML::FloatTree
 #check if a tree belongs to a trees or not
 #pass it a tree id
 trees1.has_tree? 'tree1' #returns true or false
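
One way a document-level get_tree_by_id could work is to search each trees block in turn. The sketch below is an assumption about the design, not the actual Bio::NeXML code; TreeSketch, TreesSketch and NexmlSketch are hypothetical names, and their constructors are invented for the example.

```ruby
# Hypothetical stand-ins; none of these are the real Bio::NeXML classes.
TreeSketch = Struct.new(:id, :label)

class TreesSketch
  attr_reader :id, :tree_set

  def initialize(id, trees)
    @id = id
    @tree_set = trees.to_h { |t| [t.id, t] } # tree objects indexed by 'id'
  end

  def [](id)
    @tree_set[id]
  end

  def has_tree?(id)
    @tree_set.key?(id)
  end
end

class NexmlSketch
  def initialize(trees_blocks)
    @trees_set = trees_blocks.to_h { |b| [b.id, b] }
  end

  def get_trees_by_id(id)
    @trees_set[id]
  end

  # Search every trees block; return the first tree with a matching id, or nil.
  def get_tree_by_id(id)
    block = @trees_set.each_value.find { |b| b.has_tree?(id) }
    block && block[id]
  end
end

nexml = NexmlSketch.new([
  TreesSketch.new('trees1', [TreeSketch.new('tree1', 'Tree 1')]),
  TreesSketch.new('trees2', [TreeSketch.new('tree2', 'Tree 2')])
])

puts nexml.get_trees_by_id('trees1').has_tree?('tree1')
puts nexml.get_tree_by_id('tree2').label
puts nexml.get_tree_by_id('missing').inspect
```

The document-level lookup costs one hash probe per trees block, which stays cheap as long as a file holds only a few blocks.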
 

All methods of the Bio::Tree class can be called on a tree object.

 node1 = tree1.get_node_by_name "n3" #note that the name is the same as the id
 tree1.parents node1

A trees object is enumerable:

 trees1.map &:id