Difference between revisions of "NeXML and RDF API for BioRuby"

From Phyloinformatics
Revision as of 13:15, 19 June 2010

Preface

The following document discusses the implementation of an NeXML parser and serializer and an RDF API for BioRuby. Note that this document is not final yet.

Parsing

Currently, all parsing is done up front (i.e. no streaming); this is likely to change later. To parse an NeXML file:

 doc = Bio::NeXML::Parser.new( "trees.xml" )
 nexml = doc.parse
 nexml.class #Bio::NeXML::Nexml

Otus and Otu

Taxa blocks (otus) and taxa (otu) are stored internally in Ruby hashes for faster lookup by 'id'.

 nexml.otus_set #a hash of otus objects indexed with 'id'
 nexml.otus #an array of otus objects
 #iterate over each otus object
 nexml.each_otus do |taxa|
   puts taxa.id
   puts taxa.label
 end
 #find an otus by id
 taxa1 = nexml.get_otus_by_id "taxa1"
 taxa1.class #Bio::NeXML::Otus
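
The hash-backed storage described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `SketchNexml`, `SketchOtus`, and `add_otus` are hypothetical stand-ins, not the Bio::NeXML classes, and the remaining method names simply mirror the calls shown above. The labels are made-up sample values.

```ruby
# Hypothetical stand-ins for illustration; not the Bio::NeXML implementation.
SketchOtus = Struct.new(:id, :label)

class SketchNexml
  # hash of otus objects indexed with 'id'
  attr_reader :otus_set

  def initialize
    @otus_set = {}
  end

  # store an otus object under its id (assumed helper, not in the original API)
  def add_otus(otus)
    @otus_set[otus.id] = otus
  end

  # array view of the stored otus objects
  def otus
    @otus_set.values
  end

  # iterate over each otus object
  def each_otus(&block)
    @otus_set.each_value(&block)
  end

  # hash lookup by id; returns nil if the id is unknown
  def get_otus_by_id(id)
    @otus_set[id]
  end
end

nexml = SketchNexml.new
nexml.add_otus SketchOtus.new("taxa1", "Primates")
nexml.add_otus SketchOtus.new("taxa2", "Rodents")

taxa1 = nexml.get_otus_by_id "taxa1"
puts taxa1.label
nexml.each_otus { |taxa| puts taxa.id }
```

Keeping the hash as the single store and deriving the array from it means the two views cannot drift apart.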

Otu:

 taxa1.otu_set #a hash of otu objects indexed with 'id'
 taxa1.otus #an array of otu objects
 #get an individual otu object given its id
 taxon1 = taxa1[ 'taxon1' ]
 #or, a conventional method call
 taxon1 = taxa1.get_otu_by_id 'taxon1'
 #or, get an otu from nexml object
 taxon1 = nexml.get_otu_by_id 'taxon1'
 #or iterate over each otu object
 #each_otu is an alias for each
 taxa1.each do |taxon|
   puts taxon.id
   puts taxon.label
 end
 #or iterate with id
 taxa1.each_with_id do |id, taxon|
   puts "#{id} => #{taxon.label}"
 end
 #check if an otu belongs to an otus or not
 #pass it an otu id
 #include? is an alias for has_otu?
 taxa1.has_otu? 'taxon1' # => true or false

Each otus object is enumerable. This could be especially useful once the class element is supported.

 taxa1.map &:id
 taxa1.select {|t| t.class == "Lemurs" } #maybe in future
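
As a minimal sketch of how such enumerable behaviour is usually obtained in Ruby: a class that defines `each` and includes the `Enumerable` module gets `map`, `select`, and the rest for free. `EnumerableOtus` and `SketchOtu` below are hypothetical stand-ins, not the Bio::NeXML classes, and the labels are made-up sample values.

```ruby
# Hypothetical stand-ins for illustration; not the Bio::NeXML implementation.
SketchOtu = Struct.new(:id, :label)

class EnumerableOtus
  include Enumerable # supplies map, select, etc. on top of each

  def initialize(otu_list)
    @otu_set = {}
    otu_list.each { |otu| @otu_set[otu.id] = otu }
  end

  # Enumerable only requires each to be defined
  def each(&block)
    @otu_set.each_value(&block)
  end
end

taxa1 = EnumerableOtus.new([
  SketchOtu.new("taxon1", "Lemur catta"),
  SketchOtu.new("taxon2", "Homo sapiens")
])

p taxa1.map(&:id)
p taxa1.select { |t| t.label.start_with?("Lemur") }.map(&:id)
```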

Trees and Tree

Trees blocks, trees, and networks are stored internally in Ruby hashes for faster lookup by 'id'.

 nexml.trees_set #return a hash of trees objects indexed with 'id'
 nexml.trees #return an array of trees objects.
 #iterate over each trees object
 nexml.each_trees do |trees|
   puts trees.id
   puts trees.label
 end
 #find a trees by id
 trees1 = nexml.get_trees_by_id 'trees1'
 trees1.class #Bio::NeXML::Trees
 #get the taxa block to which the trees is linked
 trees1.otus #returns an otus object

Tree:

 trees1.tree_set #return a hash of tree objects indexed with 'id'
 trees1.trees #return an array of tree objects
 #iterate over each tree object
 trees1.each_tree do |t|
   puts t.id
   puts t.label
 end
 #get a tree object with its id
 tree1 = trees1[ 'tree1' ]
 #or, with a conventional method call
 tree1 = trees1.get_tree_by_id 'tree1'
 #or, from a nexml object
 tree1 = nexml.get_tree_by_id 'tree1'
 tree1.class #Bio::NeXML::IntTree or Bio::NeXML::FloatTree
 #check if a tree belongs to a trees or not
 #pass it a tree id
 trees1.has_tree? 'tree1' #return true or false
 #get the number of trees
 trees1.number_of_trees

Network:

 trees1.network_set #return a hash of network objects indexed with 'id'
 trees1.networks #return an array of network objects
 #iterate over each network object
 trees1.each_network do |n|
   puts n.id
   puts n.label
 end
 #get a network object with its id
 network1 = trees1[ 'network1' ]
 #or, with a conventional method call
 network1 = trees1.get_network_by_id 'network1'
 #or, from a nexml object
 network1 = nexml.get_tree_by_id 'network1'
 network1.class #Bio::NeXML::IntTree or Bio::NeXML::FloatTree
 #check if a network belongs to a trees or not
 #pass it a network id
 trees1.has_network? 'network1' #return true or false
 #get the number of networks
 trees1.number_of_networks

Tree and Network:

 #iterate over both trees and networks
 trees1.each do |g|
   puts g.class
 end
 #find if a tree or a network belongs to a trees or not
 #include? is an alias for has?
 trees1.has? 'tree1' #return true or false
 #total number of trees and networks
 trees1.number_of_graphs
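
One way a container could offer the combined tree and network view shown above is to keep two hashes and let the generic methods consult both. The sketch below assumes exactly that; `SketchTrees`, `SketchTree`, `SketchNetwork`, `add_tree`, and `add_network` are hypothetical names, not the Bio::NeXML implementation.

```ruby
# Hypothetical stand-ins for illustration; not the Bio::NeXML implementation.
SketchTree    = Struct.new(:id, :label)
SketchNetwork = Struct.new(:id, :label)

class SketchTrees
  include Enumerable

  def initialize
    @tree_set    = {} # tree objects indexed with 'id'
    @network_set = {} # network objects indexed with 'id'
  end

  def add_tree(tree)
    @tree_set[tree.id] = tree
  end

  def add_network(network)
    @network_set[network.id] = network
  end

  def has_tree?(id)
    @tree_set.key?(id)
  end

  def has_network?(id)
    @network_set.key?(id)
  end

  # true if the id names either a tree or a network
  def has?(id)
    has_tree?(id) || has_network?(id)
  end
  alias_method :include?, :has?

  # look in the trees first, then in the networks
  def [](id)
    @tree_set[id] || @network_set[id]
  end

  # yield every tree, then every network
  def each(&block)
    @tree_set.each_value(&block)
    @network_set.each_value(&block)
  end

  def number_of_trees
    @tree_set.size
  end

  def number_of_networks
    @network_set.size
  end

  # total number of trees and networks
  def number_of_graphs
    number_of_trees + number_of_networks
  end
end

trees1 = SketchTrees.new
trees1.add_tree SketchTree.new("tree1", "a tree")
trees1.add_network SketchNetwork.new("network1", "a network")

trees1.each { |g| puts g.class }
puts trees1.number_of_graphs
```

This design assumes tree and network ids do not collide within one trees block; if they did, `[]` would return the tree.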

All methods of the Bio::Tree class can be called on a tree object.

 node1 = tree1.get_node_by_name "n3" #note name is same as id
 tree1.parents node1

A trees object is an enumerable:

 trees1.map &:id

Characters

 nexml.characters_set #return a hash of characters objects indexed with 'id'
 nexml.characters #return an array of characters objects
 #iterate over each characters object
 nexml.each_characters do |ch|
   puts ch.id
   puts ch.label
 end
 #find a characters object by id
 characters = nexml.get_characters_by_id 'chars1'
 puts characters.class
 #get the taxa block to which the characters is linked
 characters.otus #returns an otus object
 #get the child format element
 format = characters.format
 puts format.class
 #get the child matrix element
 matrix = characters.matrix
 puts matrix.class

Format

 format.states_set #return a hash of states objects indexed with 'id'
 format.states #return an array of states objects
 #iterate over each states object
 format.each_states do |states|
   puts states.id
   puts states.label
 end
 #get a states object by id
 states = format.get_states_by_id 'states1'
 
 #check if the states object with 'id' belongs to format or not
 format.has_states? 'states1'
 format.char_set #return a hash of char objects indexed with 'id'
 format.chars #return an array of char objects
 #iterate over each char object
 format.each_char do |char|
   puts char.id
   puts char.label
 end
 #get a char object by id
 char = format.get_char_by_id 'char1'
 #check if the char object with 'id' belongs to format or not
 format.has_char? 'char1'
 #get a states or a char object by id
 states = format[ 'states1' ]
 char = format[ 'char1' ]
 #check if a states or a char object with 'id' belongs to format or not
 format.has? 'states1'
 format.has? 'char1'
 #all objects, including char and states can be iterated over with each
 format.each do |obj|
   puts obj.class
 end
 #format is enumerable
 format.map &:id

States

 states.state_set #return a hash of state objects indexed with 'id'
 states.states #return an array of state objects
 #iterate over each state object
 states.each_state do |state|
   puts state.id
 end
 #or, use its alias each
 #get a state object by id
 state = states.get_state_by_id 'state1'
 #or, use hash notation
 state = states[ 'state1' ]
 #check if a state belongs to states or not
 states.has_state? 'state1'
 #or, use its aliases has? and include?
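
The aliases mentioned above could be wired up along the following lines. This is a sketch only: `SketchStates`, `SketchStateEntry`, and `add_state` are hypothetical names, and using `alias_method` is an assumption about how such aliases might be declared, not a description of the actual Bio::NeXML code.

```ruby
# Hypothetical stand-ins for illustration; not the Bio::NeXML implementation.
SketchStateEntry = Struct.new(:id, :symbol)

class SketchStates
  include Enumerable

  def initialize
    @state_set = {} # state objects indexed with 'id'
  end

  def add_state(state)
    @state_set[state.id] = state
  end

  # iterate over each state object
  def each_state(&block)
    @state_set.each_value(&block)
  end
  alias_method :each, :each_state

  # get a state object by id
  def get_state_by_id(id)
    @state_set[id]
  end
  alias_method :[], :get_state_by_id

  # check if a state belongs to this states object
  def has_state?(id)
    @state_set.key?(id)
  end
  alias_method :has?, :has_state?
  alias_method :include?, :has_state?
end

states = SketchStates.new
states.add_state SketchStateEntry.new("state1", "A")
states.add_state SketchStateEntry.new("state2", "C")

puts states["state1"].symbol
puts states.has?("state2")
```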
 

Char

 #get the states object the char is linked to
 char.states

State

 #get the symbol associated with the state
 state.symbol
 
 #find if the state is ambiguous
 state.ambiguous?
 
 #find the kind of ambiguity
 state.ambiguity
 
 #find if it is an uncertain state set
 state.uncertain?
 #find if it is a polymorphic state set
 state.polymorphic?
 
 #get the members of a state set
 state.members # array
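
The relationship between these predicates can be illustrated with a small sketch. It assumes that `ambiguity` is `nil` for a plain state and one of the symbols `:uncertain` or `:polymorphic` for a state set; that representation and the `SketchState` class are assumptions for illustration, not the Bio::NeXML implementation. The example uses the IUPAC nucleotide code R, which stands for A or G.

```ruby
# Hypothetical stand-in for illustration; not the Bio::NeXML implementation.
class SketchState
  attr_reader :id, :symbol, :ambiguity, :members

  # ambiguity: nil for a plain state, :uncertain or :polymorphic for a state set
  # (this encoding is an assumption made for the sketch)
  def initialize(id, symbol, ambiguity = nil, members = [])
    @id        = id
    @symbol    = symbol
    @ambiguity = ambiguity
    @members   = members
  end

  # a state is ambiguous if it is any kind of state set
  def ambiguous?
    !@ambiguity.nil?
  end

  def uncertain?
    @ambiguity == :uncertain
  end

  def polymorphic?
    @ambiguity == :polymorphic
  end
end

a = SketchState.new("s1", "A")
g = SketchState.new("s2", "G")
r = SketchState.new("s3", "R", :uncertain, [a, g])

puts r.ambiguous?
puts r.ambiguity
puts r.members.map(&:symbol).join(",")
```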

Matrix