Difference between revisions of "Project Plan for NeXML and RDF API in BioRuby"

From Phyloinformatics

Revision as of 13:54, 1 June 2010

Week 1 ( May 24 - May 30 )

Planned: I will start the development of the NeXML parser this week. The parser should accept NeXML input in several forms: file, IO, string, or URI. The target is to parse otus, otu, simple trees (just a tree with some nodes and edges), and class. The focus will be on designing classes to encapsulate these NeXML elements, on the parsing itself, and on unit tests, not on documentation.

Notes: Since support for class elements is not yet complete in the schema, class will not be implemented in the parser.
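As a rough illustration of the Week 1 target, the input handling and the otus/otu parsing might look like the sketch below. It is not the BioRuby API: the names `NexmlSketch`, `SketchOtu` and `SketchOtus` are hypothetical, the XML is parsed with Ruby's REXML library, the sample document omits the NeXML namespace for brevity, and the rule for telling an XML string from a file path is an assumption made only for this example.

```ruby
require 'rexml/document'
require 'stringio'
require 'uri'
require 'net/http'

# Hypothetical value classes for illustration; not the BioRuby API.
SketchOtu  = Struct.new(:id, :label)
SketchOtus = Struct.new(:id, :label, :otus)

module NexmlSketch
  # Reduce every supported input kind to an XML string.
  def self.read(source)
    return source.read if source.respond_to?(:read)          # IO, File, StringIO
    return Net::HTTP.get(source) if source.is_a?(URI::HTTP)   # http(s) URI object
    text = source.to_s
    # Assumption: a string starting with "<" is XML, anything else is a file path.
    text.lstrip.start_with?('<') ? text : File.read(text)
  end

  # Return one SketchOtus per <otus> block, each holding its <otu> children.
  def self.parse_otus(source)
    doc = REXML::Document.new(read(source))
    blocks = []
    doc.root.elements.each('otus') do |block|
      taxa = []
      block.elements.each('otu') do |otu|
        taxa << SketchOtu.new(otu.attributes['id'], otu.attributes['label'])
      end
      blocks << SketchOtus.new(block.attributes['id'], block.attributes['label'], taxa)
    end
    blocks
  end
end

# Simplified sample input (no namespace declarations).
xml = <<~XML
  <nexml version="0.9">
    <otus id="taxa1" label="Primates">
      <otu id="t1" label="Homo sapiens"/>
      <otu id="t2" label="Pan troglodytes"/>
    </otus>
  </nexml>
XML

blocks = NexmlSketch.parse_otus(StringIO.new(xml))
puts blocks.first.otus.map(&:label).inspect
```

The same call accepts the XML as a plain string, for example `NexmlSketch.parse_otus(xml)`, because `read` normalises every input kind to a string before parsing.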

Week 2 ( May 31 - June 6 )

This week I will fully implement trees and networks, including both their int and float variants, with documentation.

I will also start working on the characters element. NeXML allows two broad categories of data (sequence and granular observation), each with six subcategories. Regardless of the data type, the parser should be able to recognize: state definitions without ambiguity (state), character definitions (char), matrix rows with raw character sequences (seq), and granular observations (obs). In parsing characters, the focus will be on designing classes to abstract characters and its child elements, on the parsing itself, and on unit tests, not on documentation.
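The int and float tree variants mentioned above could be handled along these lines. This is a minimal sketch under stated assumptions: the struct and method names are hypothetical, the variant is read from the `xsi:type` attribute of the tree, only trees (not networks) are covered, and the sample XML declares just the `xsi` namespace.

```ruby
require 'rexml/document'

# Hypothetical value classes for illustration; not the BioRuby API.
SketchNode = Struct.new(:id, :otu, :root)
SketchEdge = Struct.new(:id, :source, :target, :branch_length)
SketchTree = Struct.new(:id, :kind, :nodes, :edges)

# Build a SketchTree from a <tree> element, converting edge lengths to
# Integer for an IntTree and to Float otherwise.
def parse_tree(element)
  kind = element.attributes['xsi:type'].to_s
  convert = kind.end_with?('IntTree') ? ->(s) { Integer(s) } : ->(s) { Float(s) }

  nodes = []
  element.elements.each('node') do |n|
    nodes << SketchNode.new(n.attributes['id'], n.attributes['otu'],
                            n.attributes['root'] == 'true')
  end

  edges = []
  element.elements.each('edge') do |e|
    raw = e.attributes['length']               # length is optional
    edges << SketchEdge.new(e.attributes['id'], e.attributes['source'],
                            e.attributes['target'], raw && convert.call(raw))
  end

  SketchTree.new(element.attributes['id'], kind, nodes, edges)
end

# Simplified sample input.
xml = <<~XML
  <nexml xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <trees id="trees1" otus="taxa1">
      <tree id="tree1" xsi:type="nex:FloatTree">
        <node id="n1" root="true"/>
        <node id="n2" otu="t1"/>
        <node id="n3" otu="t2"/>
        <edge id="e1" source="n1" target="n2" length="0.25"/>
        <edge id="e2" source="n1" target="n3" length="1.5"/>
      </tree>
    </trees>
  </nexml>
XML

doc  = REXML::Document.new(xml)
tree = parse_tree(doc.root.elements['trees/tree'])
puts tree.edges.map(&:branch_length).inspect
```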

Week 3 ( June 7 - June 13 )

  • Design classes to encapsulate the following elements, then parse them and return the corresponding objects:
    • Character block( characters )
    • format, states, state, char
    • matrix, row
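The elements listed for Week 3 could be read into plain Ruby structures as in the sketch below. It is an illustration only: `parse_characters` and the returned hash layout are hypothetical, just the seq form of a row is handled, and the sample XML is a simplified document without the NeXML namespace.

```ruby
require 'rexml/document'

# Read a <characters> block into a Hash (hypothetical layout):
#   :states => { states id => { symbol => state id } }
#   :chars  => [ { id:, states: } ]
#   :rows   => { otu id => array of symbols }
def parse_characters(element)
  format = element.elements['format']

  states = {}
  format.elements.each('states') do |set|
    symbols = {}
    set.elements.each('state') do |s|
      symbols[s.attributes['symbol']] = s.attributes['id']
    end
    states[set.attributes['id']] = symbols
  end

  chars = []
  format.elements.each('char') do |c|
    chars << { id: c.attributes['id'], states: c.attributes['states'] }
  end

  rows = {}
  element.elements.each('matrix/row') do |row|
    # Only raw sequences (<seq>) are handled in this sketch.
    rows[row.attributes['otu']] = row.elements['seq'].text.strip.chars
  end

  { id: element.attributes['id'], states: states, chars: chars, rows: rows }
end

# Simplified sample input.
xml = <<~XML
  <nexml xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <characters id="chars1" otus="taxa1" xsi:type="nex:DnaSeqs">
      <format>
        <states id="dna">
          <state id="s1" symbol="A"/>
          <state id="s2" symbol="C"/>
          <state id="s3" symbol="G"/>
          <state id="s4" symbol="T"/>
        </states>
        <char id="c1" states="dna"/>
        <char id="c2" states="dna"/>
        <char id="c3" states="dna"/>
      </format>
      <matrix>
        <row id="r1" otu="t1"><seq>ACG</seq></row>
        <row id="r2" otu="t2"><seq>ACT</seq></row>
      </matrix>
    </characters>
  </nexml>
XML

doc   = REXML::Document.new(xml)
block = parse_characters(doc.root.elements['characters'])
puts block[:rows].inspect
```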

Week 4 ( June 14 - June 20 )

Make sure that the API for the parser is in place, with software development iterations, tests, and documentation.

Week 5 ( June 21 - June 27 )

Development of the NeXML serializer

  • Extend the already designed classes to serialize:
    • Taxa( otu )
    • Taxa block( otus ) and
    • Sets( class )
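Serialization is the reverse of parsing: objects are turned back into elements. A minimal sketch for the taxa block, assuming REXML is used to build the output; `otus_to_element` and its arguments are hypothetical and do not reflect the planned class design.

```ruby
require 'rexml/document'

# Build an <otus> element from an id, a label and a list of
# [otu id, otu label] pairs.
def otus_to_element(id, label, taxa)
  block = REXML::Element.new('otus')
  block.add_attributes('id' => id, 'label' => label)
  taxa.each do |otu_id, otu_label|
    block.add_element('otu', 'id' => otu_id, 'label' => otu_label)
  end
  block
end

element = otus_to_element('taxa1', 'Primates',
                          [['t1', 'Homo sapiens'], ['t2', 'Pan troglodytes']])
puts element.to_s
```

Parsing the generated string again and comparing it with the original objects is one straightforward way to unit test a serializer.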

Week 6 ( June 28 - July 4 )

  • Extend the already designed classes to serialize:
    • Trees( trees )
    • Tree( tree ), Network( network ), Node( node ) and Edge( edge )

Week 7 ( July 5 - July 11 )

  • Extend the already designed classes to serialize:
    • Character block( characters )
    • format, states, state, char
    • matrix, row

Week 8 ( July 12 - July 18 )

Make sure that the API for the NeXML serializer is in place, with software development iterations, tests, and documentation.

Week 9 ( July 19 - July 25 )

Design classes for semantic annotation in BioRuby.

Week 10 ( July 26 - August 1 )

  • Parse the NeXML meta element and return the corresponding object.
  • Serialize annotations into meta tags.
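The two steps above can be sketched as a round trip between meta elements and triple-like annotation objects. The sketch assumes the RDFa-style attributes used by NeXML meta elements (`property`/`content` for literal values, `rel`/`href` for resources, and an optional `about` on the annotated element); the `SketchAnnotation` struct and both method names are hypothetical, and the final RDF API may look quite different.

```ruby
require 'rexml/document'

# Hypothetical triple-like annotation; not the BioRuby API.
SketchAnnotation = Struct.new(:subject, :predicate, :object, :literal)

# Collect the <meta> children of an element as annotations.
def parse_meta(parent)
  subject = parent.attributes['about'] || ('#' + parent.attributes['id'].to_s)
  annotations = []
  parent.elements.each('meta') do |meta|
    if meta.attributes['property']      # literal value
      annotations << SketchAnnotation.new(subject, meta.attributes['property'],
                                          meta.attributes['content'], true)
    elsif meta.attributes['rel']        # link to another resource
      annotations << SketchAnnotation.new(subject, meta.attributes['rel'],
                                          meta.attributes['href'], false)
    end
  end
  annotations
end

# Turn an annotation back into a <meta> element.
def annotation_to_meta(annotation)
  meta = REXML::Element.new('meta')
  if annotation.literal
    meta.add_attributes('property' => annotation.predicate,
                        'content'  => annotation.object)
  else
    meta.add_attributes('rel' => annotation.predicate, 'href' => annotation.object)
  end
  meta
end

# Simplified sample input; the URL is a placeholder.
xml = <<~XML
  <nexml>
    <otus id="taxa1" about="#taxa1">
      <meta id="m1" property="dc:title" content="Primate taxa"/>
      <meta id="m2" rel="dc:source" href="http://example.org/primates"/>
    </otus>
  </nexml>
XML

doc         = REXML::Document.new(xml)
annotations = parse_meta(doc.root.elements['otus'])
annotations.each { |a| puts annotation_to_meta(a).to_s }
```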

Week 11 ( August 2 - August 8 )

Make sure that the RDF API is in place, with software development iterations, tests, and documentation.

Week 12 ( August 9 - August 15 )

Tests and documentation.

References

A discussion of the API can be found here: NeXML and RDF API for BioRuby