Difference between revisions of "Project Plan for NeXML and RDF API in BioRuby"


Revision as of 14:32, 1 June 2010

Week 1 ( May 24 - May 30 )

Planned:
I will start the development of the NeXML parser this week. The parser should be able to accept NeXML input in several forms: file, IO, string, or URI. The target is to be able to parse otus, otu, simple trees (just a tree with some nodes and edges), and class. The focus will be on designing classes to encapsulate these NeXML elements, on the actual parsing, and on unit tests, not on documentation.

Notes:

  • Since the support for class elements is not complete in the schema, it will not be implemented in the parser.
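
As an illustration of the Week 1 target, the following is a minimal sketch of how the four input forms could be reduced to one XML string before the otus and otu elements are read. It uses REXML from Ruby's standard distribution purely for illustration; the plan does not say which XML library the parser uses. The names Otu, read_nexml, elements_named and parse_otus are hypothetical, as is the rule that a string starting with "<" is treated as XML and any other string as a file path.

```ruby
require 'stringio'
require 'uri'
require 'open-uri'
require 'rexml/document'

# Hypothetical value object for one taxon (<otu>).
Otu = Struct.new(:id, :label)

# Hypothetical helper: reduce a file path, IO, XML string or URI to an XML string.
def read_nexml(source)
  case source
  when IO, StringIO then source.read   # File is an IO, so open files land here
  when URI::HTTP    then source.read   # open-uri adds #read to HTTP(S) URIs
  when String
    # Assumption: a string that starts with "<" is XML, anything else is a path.
    source.lstrip.start_with?('<') ? source : File.read(source)
  else
    raise ArgumentError, "unsupported NeXML source: #{source.class}"
  end
end

# Child elements of parent with the given local name (namespace prefix ignored).
def elements_named(parent, name)
  parent.children.select { |c| c.is_a?(REXML::Element) && c.name == name }
end

# Returns a hash mapping each <otus> id to its array of Otu objects.
def parse_otus(source)
  root = REXML::Document.new(read_nexml(source)).root
  elements_named(root, 'otus').to_h do |block|
    otus = elements_named(block, 'otu').map do |e|
      Otu.new(e.attributes['id'], e.attributes['label'])
    end
    [block.attributes['id'], otus]
  end
end

SAMPLE_TAXA = <<~XML
  <nex:nexml version="0.9" xmlns:nex="http://www.nexml.org/2009" xmlns="http://www.nexml.org/2009">
    <otus id="taxa1">
      <otu id="t1" label="Homo sapiens"/>
      <otu id="t2" label="Pan troglodytes"/>
    </otus>
  </nex:nexml>
XML

taxa = parse_otus(StringIO.new(SAMPLE_TAXA))
puts taxa['taxa1'].map(&:label).join(', ')
```

Funnelling every input form through one helper keeps the element-level parsing code independent of where the document came from.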

Week 2 ( May 31 - June 6 )

Planned:

  • Completely implement trees and networks including both their int and float variants.
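
One way the int and float variants could be handled is sketched below: the xsi:type of a tree or network decides whether edge lengths are converted with Integer or Float. This is a sketch under assumptions, not the planned class design; PhyloNode, PhyloEdge, PhyloTree, parse_tree and parse_trees are hypothetical names, and REXML is used only for illustration.

```ruby
require 'rexml/document'

# Hypothetical value objects for the sketch.
PhyloNode = Struct.new(:id, :otu, :root)
PhyloEdge = Struct.new(:id, :source, :target, :length)
PhyloTree = Struct.new(:id, :kind, :nodes, :edges)

# Child elements of parent with the given local name (namespace prefix ignored).
def elements_named(parent, name)
  parent.children.select { |c| c.is_a?(REXML::Element) && c.name == name }
end

# Builds a PhyloTree from a <tree> or <network> element.
def parse_tree(el)
  kind = el.attributes['xsi:type'].to_s.split(':').last.to_s # e.g. "FloatTree"
  # Int variants carry integer edge lengths; everything else is read as float.
  convert = kind.start_with?('Int') ? ->(s) { Integer(s) } : ->(s) { Float(s) }
  nodes = elements_named(el, 'node').map do |n|
    PhyloNode.new(n.attributes['id'], n.attributes['otu'], n.attributes['root'] == 'true')
  end
  edges = elements_named(el, 'edge').map do |e|
    raw = e.attributes['length']
    PhyloEdge.new(e.attributes['id'], e.attributes['source'],
                  e.attributes['target'], raw && convert.call(raw))
  end
  PhyloTree.new(el.attributes['id'], kind, nodes, edges)
end

# Returns all trees first, then all networks, for every <trees> block.
def parse_trees(xml)
  root = REXML::Document.new(xml).root
  elements_named(root, 'trees').flat_map do |block|
    (elements_named(block, 'tree') + elements_named(block, 'network')).map { |t| parse_tree(t) }
  end
end

SAMPLE_TREES = <<~XML
  <nex:nexml version="0.9" xmlns:nex="http://www.nexml.org/2009" xmlns="http://www.nexml.org/2009"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <trees id="trees1" otus="taxa1">
      <tree id="tree1" xsi:type="nex:FloatTree">
        <node id="n1" root="true"/>
        <node id="n2" otu="t1"/>
        <node id="n3" otu="t2"/>
        <edge id="e1" source="n1" target="n2" length="0.34"/>
        <edge id="e2" source="n1" target="n3" length="0.56"/>
      </tree>
      <network id="net1" xsi:type="nex:IntNetwork">
        <node id="m1"/>
        <node id="m2"/>
        <edge id="f1" source="m1" target="m2" length="2"/>
      </network>
    </trees>
  </nex:nexml>
XML

parse_trees(SAMPLE_TREES).each do |t|
  puts "#{t.id} (#{t.kind}): #{t.nodes.size} nodes, #{t.edges.size} edges"
end
```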

Start working on the characters element. NeXML allows two broad categories of data (sequence and granular observation), each with six subcategories. Regardless of the type, the parser should be able to recognize:

  • state definitions - states and its child elements (ambiguous state definitions may be left out; to be discussed with Rutger)
  • character definitions - char
  • matrix - row, raw character sequences (seq) and granular observations (cell)

In parsing characters, the focus will be on designing classes to abstract characters and its child elements, on the actual parsing, and on unit tests, not on documentation.
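
A type-agnostic reading of a characters block, as described above, could look like the following sketch. It collects state definitions, character definitions and matrix rows into plain hashes, and handles both seq and cell rows without looking at the block's type. The function names and the hash layout are assumptions made for illustration; ambiguous state definitions are not handled here.

```ruby
require 'rexml/document'

# Child elements of parent with the given local name (namespace prefix ignored).
def elements_named(parent, name)
  parent.children.select { |c| c.is_a?(REXML::Element) && c.name == name }
end

# Returns a plain hash describing one <characters> element.
def parse_characters(el)
  states = {} # states id => { state id => symbol }
  chars  = {} # char id   => states id
  format = elements_named(el, 'format').first
  if format
    elements_named(format, 'states').each do |set|
      states[set.attributes['id']] =
        elements_named(set, 'state').to_h { |s| [s.attributes['id'], s.attributes['symbol']] }
    end
    elements_named(format, 'char').each { |c| chars[c.attributes['id']] = c.attributes['states'] }
  end

  matrix = elements_named(el, 'matrix').first
  rows = (matrix ? elements_named(matrix, 'row') : []).map do |r|
    seq = elements_named(r, 'seq').first
    # Granular observations: one <cell> per character, char id => state id.
    cells = elements_named(r, 'cell').to_h { |c| [c.attributes['char'], c.attributes['state']] }
    { id: r.attributes['id'], otu: r.attributes['otu'],
      seq: seq ? seq.text.to_s.strip : nil, cells: cells }
  end

  { id: el.attributes['id'], type: el.attributes['xsi:type'],
    states: states, chars: chars, rows: rows }
end

SAMPLE_CHARACTERS = <<~XML
  <nex:nexml version="0.9" xmlns:nex="http://www.nexml.org/2009" xmlns="http://www.nexml.org/2009"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <characters id="c1" otus="taxa1" xsi:type="nex:StandardCells">
      <format>
        <states id="ss1">
          <state id="s1" symbol="0"/>
          <state id="s2" symbol="1"/>
        </states>
        <char id="ch1" states="ss1"/>
        <char id="ch2" states="ss1"/>
      </format>
      <matrix>
        <row id="r1" otu="t1">
          <cell char="ch1" state="s1"/>
          <cell char="ch2" state="s2"/>
        </row>
      </matrix>
    </characters>
    <characters id="c2" otus="taxa1" xsi:type="nex:DnaSeqs">
      <matrix>
        <row id="r2" otu="t2">
          <seq>ACGTTGCA</seq>
        </row>
      </matrix>
    </characters>
  </nex:nexml>
XML

root = REXML::Document.new(SAMPLE_CHARACTERS).root
elements_named(root, 'characters').each do |el|
  block = parse_characters(el)
  puts "#{block[:id]}: #{block[:rows].size} row(s), #{block[:chars].size} char definition(s)"
end
```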

Week 3 ( June 7 - June 13 )

Planned:

  • Completely implement characters with the supported types.
  • Document the code base and make sure that the parsing API is in place, complete with tests and documentation.
  • Request feedback from the BioRuby community (this will be done at the end of the week).

Week 4 ( June 14 - June 20 )

Planned:
Development of the NeXML serializer

  • Extend the already designed classes to serialize:
    • Taxa( otu )
    • Taxa block( otus )
    • Sets( class )
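
Extending the existing classes to serialize could be as small as giving each one a method that returns an XML element, as in the sketch below. Taxon, TaxaBlock and to_xml are hypothetical names, and REXML is again used only for illustration. Sets (class) are left out of the sketch because the Week 1 notes say schema support for class is incomplete.

```ruby
require 'rexml/document'

# Hypothetical taxon class with a serialization method added.
Taxon = Struct.new(:id, :label) do
  # Returns an <otu> element; the label attribute is optional.
  def to_xml
    el = REXML::Element.new('otu')
    el.add_attribute('id', id)
    el.add_attribute('label', label) if label
    el
  end
end

# Hypothetical taxa block that serializes itself and its taxa.
TaxaBlock = Struct.new(:id, :label, :otus) do
  # Returns an <otus> element containing one <otu> child per taxon.
  def to_xml
    el = REXML::Element.new('otus')
    el.add_attribute('id', id)
    el.add_attribute('label', label) if label
    otus.each { |otu| el.add_element(otu.to_xml) }
    el
  end
end

block = TaxaBlock.new('taxa1', 'Primates',
                      [Taxon.new('t1', 'Homo sapiens'), Taxon.new('t2', nil)])
puts block.to_xml
```

Returning an element rather than a string lets a parent object (for example the document root) attach the result directly to its own element.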

Week 5 ( June 21 - June 27 )

  • Extend the already designed classes to serialize:
    • Trees( trees )
    • Tree( tree ), Network( network ), Node( node ) and Edge( edge )

Week 6 ( June 28 - July 4 )

  • Extend the already designed classes to serialize:
    • Character block( characters )
    • format, states, state, char
    • matrix, row

Week 7 ( July 5 - July 11 )

Make sure that the API for the NeXML serializer is in place, with software development iterations, tests and documentation.

Week 8 ( July 12 - July 18 )

Design classes for semantic annotation in BioRuby.

Week 9 ( July 19 - July 25 )

  • Parse meta NeXML element and return the corresponding object.
  • Serialize annotations into meta tag.
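
The two meta tasks above can be pictured as a round trip between meta elements and simple subject/predicate/object annotations, sketched below. The Annotation struct, parse_meta and meta_element are hypothetical names; using the parent element's id as the subject is an assumption of this sketch, not something the plan specifies. Literal values are read from property/content and resource values from rel/href.

```ruby
require 'rexml/document'

# Hypothetical annotation object: literal is true for literal-valued meta.
Annotation = Struct.new(:subject, :predicate, :object, :literal)

# Child elements of parent with the given local name (namespace prefix ignored).
def elements_named(parent, name)
  parent.children.select { |c| c.is_a?(REXML::Element) && c.name == name }
end

# Reads the <meta> children of parent into Annotation objects.
def parse_meta(parent)
  subject = parent.attributes['id'] # assumption: the parent id names the subject
  elements_named(parent, 'meta').map do |m|
    if m.attributes['href']
      Annotation.new(subject, m.attributes['rel'], m.attributes['href'], false)
    else
      Annotation.new(subject, m.attributes['property'], m.attributes['content'], true)
    end
  end
end

# Builds a <meta> element for one annotation; id is supplied by the caller.
def meta_element(annotation, id)
  el = REXML::Element.new('meta')
  el.add_attribute('id', id)
  if annotation.literal
    el.add_attribute('xsi:type', 'nex:LiteralMeta')
    el.add_attribute('property', annotation.predicate)
    el.add_attribute('content', annotation.object)
  else
    el.add_attribute('xsi:type', 'nex:ResourceMeta')
    el.add_attribute('rel', annotation.predicate)
    el.add_attribute('href', annotation.object)
  end
  el
end

SAMPLE_META = <<~XML
  <nex:nexml version="0.9" xmlns:nex="http://www.nexml.org/2009" xmlns="http://www.nexml.org/2009"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
    <otus id="taxa1">
      <otu id="t1" label="Homo sapiens">
        <meta id="m1" xsi:type="nex:LiteralMeta" property="dc:description" content="Modern human"/>
        <meta id="m2" xsi:type="nex:ResourceMeta" rel="dc:source" href="http://example.org/taxon/9606"/>
      </otu>
    </otus>
  </nex:nexml>
XML

root = REXML::Document.new(SAMPLE_META).root
otu  = elements_named(elements_named(root, 'otus').first, 'otu').first
parse_meta(otu).each { |a| puts "#{a.subject} #{a.predicate} #{a.object}" }
```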

Week 10 ( July 26 - August 1 )

Make sure that the RDF API is in place, with software development iterations, tests and documentation.

Week 11 ( August 2 - August 8 )

Tests and documentation.

Week 12 ( August 9 - August 15 )

Feedback.

Technicalities

GitHub (http://github.com/yeban/bioruby) is used for code collaboration. Any NeXML file read during parser development is validated against the current NeXML schema to ensure correctness. The code is unit tested with Ruby's unit testing framework, and documentation is generated using RDoc. All NeXML elements are documented at NeXML Elements, and an API discussion can be found at NeXML and RDF API for BioRuby.