NeXML and RDF API for BioRuby

<strong>Note:</strong> This page has moved to: https://github.com/rvosa/bio-nexml/wiki/NeXML-API-for-BioRuby
The following document discusses the implementation of an NeXML parser and serializer and an RDF API for BioRuby. Note that this document is not final yet.
Currently all parsing is done up front (i.e. there is no streaming). This is likely to change later. To parse an NeXML file:
  doc = Bio::NeXML::Parser.new( "trees.xml" )
  nexml = doc.parse
  nexml.class #Bio::NeXML::Nexml
  #get a hash of otus objects indexed with 'id'
  #get an array of otus objects
  #get an otus by id
  taxa1 = nexml.get_otus_by_id "taxa1"
  #iterate over each otus object
  nexml.each_otus do |taxa|
    puts taxa.id
    puts taxa.label
  end
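To illustrate what parsing everything up front means, here is a minimal sketch that reads a whole document eagerly and indexes its taxa blocks by id. It uses Ruby's REXML library rather than the Bio::NeXML parser, and the XML fragment, the <code>OtusRecord</code> struct and the <code>parse_all</code> helper are all made up for this example. It is not how bio-nexml is implemented.

```ruby
require 'rexml/document'

# A small hand-written NeXML-like fragment (namespaces omitted for brevity).
# The label of t2 is a placeholder invented for this example.
NEXML_SOURCE = <<~NEXML
  <nexml version="0.9">
    <otus id="taxa1" label="Primary taxa block">
      <otu id="t1" label="Homo sapiens"/>
      <otu id="t2" label="Example taxon"/>
    </otus>
  </nexml>
NEXML

# Hypothetical value object; not the Bio::NeXML::Otus class.
OtusRecord = Struct.new(:id, :label, :otu_ids)

# Eager parsing: build the full document tree first, then index every
# <otus> element by its id before any lookup happens.
def parse_all(xml)
  doc = REXML::Document.new(xml)
  index = {}
  doc.root.each_element do |el|
    next unless el.name == 'otus'
    otu_ids = []
    el.each_element do |child|
      otu_ids << child.attributes['id'] if child.name == 'otu'
    end
    index[el.attributes['id']] =
      OtusRecord.new(el.attributes['id'], el.attributes['label'], otu_ids)
  end
  index
end

OTUS_INDEX = parse_all(NEXML_SOURCE)
puts OTUS_INDEX['taxa1'].label # prints "Primary taxa block"
```

The cost of this approach is that the whole file is held in memory before the first lookup. A streaming parser would avoid that, at the price of a more involved API.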
Taxa blocks and taxa are stored internally in Ruby hashes for faster id-based lookup. Consider [https://www.nescent.org/wg_phyloinformatics/NeXML_Elements#Example this] NeXML snippet:
  #get the id of otus
  taxa1.id # "taxa1"
  #get the label of otus
  taxa1.label # "Primary taxa block"
  #get a hash of child otu objects indexed with id
  #get an array of child otu objects
  #get an otu object by id
  #get_otu_by_id is an alias of []
  t1 = taxa1[ 't1' ]
  #or iterate over each otu object
  #each_otu is an alias for each
  taxa1.each do |taxon|
    puts taxon.id
    puts taxon.label
  end
  #or iterate with id
  taxa1.each_with_id do |id, taxon|
    puts "#{id} => #{taxon.label}"
  end
  #check if an otu with given id belongs to an otus or not
  #include? and has? are aliases for has_otu?
  taxa1.has_otu? 't2' # => true
  taxa1.has? 't8' # => false
  #an otus object is enumerable
  taxa1.map &:id # => array of otu ids
  taxa1.select {|t| t.class == "Lemurs" } #maybe in future
  #get an otu's id
  t1.id # => "t1"
  #get an otu's label
  t1.label # => "Homo sapiens"
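The behaviour described above (id-based lookup, iteration, aliases, Enumerable) can be approximated by a small hash-backed container. The sketch below is a guess at the general pattern, not the bio-nexml source. <code>Otu</code> and <code>OtuBlock</code> are hypothetical stand-ins, and only the method names that appear in the examples above are borrowed from the API.

```ruby
# Hypothetical stand-ins; not the Bio::NeXML classes.
Otu = Struct.new(:id, :label)

class OtuBlock
  include Enumerable

  attr_reader :id, :label

  def initialize(id, label = nil)
    @id = id
    @label = label
    @otu_set = {} # id => Otu; Ruby hashes preserve insertion order
  end

  # Add an otu and return self so that calls can be chained.
  def <<(otu)
    @otu_set[otu.id] = otu
    self
  end

  # Hash lookup by id; returns nil for an unknown id.
  def [](id)
    @otu_set[id]
  end
  alias_method :get_otu_by_id, :[]

  # Yield each otu; Enumerable builds map, select, etc. on top of this.
  def each(&block)
    @otu_set.each_value(&block)
  end
  alias_method :each_otu, :each

  # Yield id and otu pairs.
  def each_with_id(&block)
    @otu_set.each(&block)
  end

  def has_otu?(id)
    @otu_set.key?(id)
  end
  alias_method :has?, :has_otu?
  alias_method :include?, :has_otu?
end

taxa1 = OtuBlock.new('taxa1', 'Primary taxa block')
taxa1 << Otu.new('t1', 'Homo sapiens') << Otu.new('t2', 'Example taxon')

puts taxa1['t1'].label      # prints "Homo sapiens"
puts taxa1.has?('t8')       # prints "false"
puts taxa1.map(&:id).inspect # prints ["t1", "t2"]
```

Keeping the objects in a hash makes <code>[]</code> and <code>has_otu?</code> constant-time on average, while defining <code>each</code> is enough for Enumerable to supply <code>map</code> and <code>select</code>.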
== Trees ==
Trees blocks, trees and networks are stored internally in Ruby hashes for faster id-based lookup.
  nexml.trees_set #return a hash of trees object indexed with 'id'
  nexml.trees #return an array of trees objects.
  #iterate over each trees object
  nexml.each_trees do |trees|
    puts trees.id
    puts trees.label
  end
  #find a trees by id
  trees1 = nexml.get_trees_by_id 'trees1'
  trees1.class #Bio::NeXML::Trees
  #get the taxa block to which the trees object is linked
  trees1.otus #returns an otus object
  trees1.tree_set #return a hash of tree objects indexed with 'id'
  trees1.trees #return an array of tree objects
  #iterate over each tree object
  trees1.each_tree do |t|
    puts t.id
    puts t.label
  end
  #get a tree object with its id
  tree1 = trees1[ 'tree1' ]
  #or, with a conventional method call
  tree1 = trees1.get_tree_by_id 'tree1'
  #or, from a nexml object
  tree1 = nexml.get_tree_by_id 'tree1'
  tree1.class #Bio::NeXML::IntTree or Bio::NeXML::FloatTree
  #check if a tree belongs to a trees or not
  #pass it a tree id
  trees1.has_tree? 'tree1' #return true or false
  #get the number of trees
  trees1.network_set #return a hash of network objects indexed with 'id'
  trees1.networks #return an array of network objects
  #iterate over each network object
  trees1.each_network do |n|
    puts n.id
    puts n.label
  end
  #get a network object with its id
  network1 = trees1[ 'network1' ]
  #or, with a conventional method call
  network1 = trees1.get_network_by_id 'network1'
  #or, from a nexml object
  network1 = nexml.get_tree_by_id 'network1'
  network1.class #Bio::NeXML::IntTree or Bio::NeXML::FloatTree
  #check if a network belongs to a trees or not
  #pass it a network id
  trees1.has_network? 'network1' #return true or false
  #get the number of networks
Tree and Network:
  #iterate over both trees and networks
  trees1.each do |g|
    puts g.class
  end
  #find if a tree or a network belongs to a trees or not
  #include? is an alias for has?
  trees1.has? 'tree1' #return true or false
  #total number of trees and networks
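One way to picture a trees object that holds both trees and networks is a container with two id-indexed hashes and a combined <code>each</code>. The sketch below is only an illustration under that assumption. <code>SketchTree</code>, <code>SketchNetwork</code>, <code>SketchTrees</code> and its <code>add</code> method are hypothetical and do not come from bio-nexml, and the order in which trees and networks are visited is a choice made for this example.

```ruby
# Hypothetical stand-ins for tree and network objects.
SketchTree    = Struct.new(:id, :label)
SketchNetwork = Struct.new(:id, :label)

class SketchTrees
  include Enumerable

  attr_reader :tree_set, :network_set

  def initialize
    @tree_set = {}    # id => SketchTree
    @network_set = {} # id => SketchNetwork
  end

  # File the object under the hash that matches its type.
  def add(graph)
    target = graph.is_a?(SketchNetwork) ? @network_set : @tree_set
    target[graph.id] = graph
    self
  end

  def trees
    @tree_set.values
  end

  def networks
    @network_set.values
  end

  def each_tree(&block)
    @tree_set.each_value(&block)
  end

  def each_network(&block)
    @network_set.each_value(&block)
  end

  # Visit all trees first, then all networks.
  def each(&block)
    return enum_for(:each) unless block
    each_tree(&block)
    each_network(&block)
    self
  end

  # Look the id up among trees, then among networks.
  def [](id)
    @tree_set[id] || @network_set[id]
  end

  def has_tree?(id)
    @tree_set.key?(id)
  end

  def has_network?(id)
    @network_set.key?(id)
  end

  def has?(id)
    has_tree?(id) || has_network?(id)
  end
  alias_method :include?, :has?
end

trees1 = SketchTrees.new
trees1.add(SketchTree.new('tree1', 'a tree'))
trees1.add(SketchNetwork.new('network1', 'a network'))

puts trees1.map(&:id).inspect # prints ["tree1", "network1"]
puts trees1.count             # prints 2 (Enumerable#count over both kinds)
```

Because <code>each</code> walks both hashes, every Enumerable method sees trees and networks together, while <code>each_tree</code> and <code>each_network</code> stay available for type-specific iteration.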
All the available methods from <code>[http://bioruby.org/rdoc/classes/Bio/Tree.html#M001688 Bio::Tree]</code> class can be called on a <code>tree</code> object.
  node1 = tree1.get_node_by_name "n3" #note name is same as id
  tree1.parents node1
A <code>trees</code> object is an enumerable:
  trees1.map &:id
  nexml.characters_set #return a hash of characters objects indexed with 'id'
  nexml.characters #return an array of characters objects
  #iterate over each characters object
  nexml.each_characters do |ch|
    puts ch.id
    puts ch.label
  end
  #find a characters object by id
  characters = nexml.get_characters_by_id 'chars1'
  puts characters.class
  #get the taxa block to which the characters object is linked
  characters.otus #returns an otus object
  #get the child format element
  format = characters.format
  puts format.class
  #get the child matrix element
  matrix = characters.matrix
  puts matrix.class
  format.states_set #return a hash of states objects indexed with 'id'
  format.states #return an array of states objects
  #iterate over each states object
  format.each_states do |states|
    puts states.id
    puts states.label
  end
  #get a states object by id
  states = format.get_states_by_id 'states1'
  #check if the states object with 'id' belongs to format or not
  format.has_states? 'states1'
  format.char_set #return a hash of char objects indexed with 'id'
  format.chars #return an array of char objects
  #iterate over each char object
  format.each_char do |char|
    puts char.id
    puts char.label
  end
  #get a char object by id
  char = format.get_char_by_id 'char1'
  #check if the char object with 'id' belongs to format or not
  format.has_char? 'char1'
  #get a states or a char object by id
  state = format[ 'states1' ]
  char = format[ 'char1' ]
  #check if a states or a char object with 'id' belongs to format or not
  format.has? 'states1'
  format.has? 'char1'
  #all objects, including char and states can be iterated over with each
  format.each do |obj|
    puts obj.class
  end
  #format is enumerable
  format.map &:id
  states.state_set #return a hash of state objects indexed with 'id'
  states.states #return an array of state objects
  #iterate over each state object
  states.each_state do |state|
    puts state.id
  end
  #or, use its alias each
  #get a state object by id
  state = states.get_state_by_id 'state1'
  #or, use hash notation
  state = states[ 'state1' ]
  #check if a state belongs to states or not
  states.has_state? 'state1'
  #or, use its alias has? and include?
  #get the symbol associated with the state
  #find if the state is ambiguous
  #find the kind of ambiguity
  #find if it is an uncertain state set
  #find if it is a polymorphic state set
  #get the members of a state set as an array
  #or iterate over each member
  state.each do |member|
    puts member.class #same as self
    puts member.id
  end
  #a state is Enumerable over its members
  state.select{ |member| member.id == "rna5" }
  #get the id
  #get the label
  #get the states object the char is linked to
  #get the codon position for DnaChar and RnaChar objects
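The state comments above mention symbols, ambiguity, uncertain and polymorphic state sets, and iteration over members. As a rough picture of how such an object could be shaped, here is a sketch in plain Ruby. Every name in it (<code>SketchState</code>, <code>kind</code>, <code>ambiguous?</code>, <code>uncertain?</code>, <code>polymorphic?</code>, <code>members</code>) is an assumption made for this example, since the corresponding bio-nexml method names are not shown on this page.

```ruby
# Hypothetical model of a state that may be a set of other states.
class SketchState
  include Enumerable

  attr_reader :id, :symbol, :kind

  # kind is nil for a plain state, or :uncertain / :polymorphic for a set.
  def initialize(id, symbol, kind: nil, members: [])
    @id = id
    @symbol = symbol
    @kind = kind
    @members = members
  end

  # A state is treated as ambiguous when it stands for several member states.
  def ambiguous?
    !@members.empty?
  end

  def uncertain?
    @kind == :uncertain
  end

  def polymorphic?
    @kind == :polymorphic
  end

  # Return a copy so callers cannot modify the internal list.
  def members
    @members.dup
  end

  # Enumerable over the member states.
  def each(&block)
    @members.each(&block)
  end
end

# IUPAC code R stands for "A or G"; the ids are placeholders.
a = SketchState.new('s1', 'A')
g = SketchState.new('s2', 'G')
r = SketchState.new('s3', 'R', kind: :uncertain, members: [a, g])

puts r.ambiguous?               # prints "true"
puts r.map(&:symbol).inspect    # prints ["A", "G"]
puts a.ambiguous?               # prints "false"
```

Members are instances of the same class as the set itself, which matches the "same as self" remark in the iteration example above.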
[[Category:NeXML and RDF API for BioRuby]]

Latest revision as of 19:06, 22 September 2011
