PhyloSoC: NeXML to MIAPA mapping

This is the Phyloinformatics Summer of Code project page for the NeXML to MIAPA mapping & ISAtab transformation project. You'll find links to various project documentation and resources below. If you're looking for the code, go to the project GitHub page.

Summary
The project is intended to support MIAPA, the emerging phylogenetics minimum information standard, via a tool to transform NeXML files into ISAtab files compatible with the ISAtools software suite. ISAtools is a data sharing, curation, and reuse platform used successfully in other genetics communities. Accessing it will allow phylogenetics to participate more fully in the larger genetics and data curation communities.

The most difficult part of the project turned out to be mapping NeXML, an extensible XML specification explicitly designed for phylogenetics, to ISAtab, a generic, tab-delimited format intended to represent scientific studies in Investigation, Study, and Assay logical units. ISAtab is well suited to wetlab experiments, but it was not immediately clear how phylogenetic inference should be mapped to the ISA paradigm. The Logical Mapping that I developed with input from mentors and the community treats character state matrix rows as 'sources' equivalent to specimens, the matrices themselves as 'samples' equivalent to portions or combinations of these specimens prepared for analysis, and the phylogenetic inferences as 'assays' that use the 'technology' of inference software and the 'methodology' of inference.
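The mapping described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the project's actual code: the sample NeXML document is hypothetical and heavily trimmed, the function names are invented for this example, and the output is a single flattened table, whereas real ISAtab output is split across investigation, study, and assay files. It also assumes that every tree in the file was inferred from every matrix, which a real file may not guarantee.

```python
import csv
import io
import xml.etree.ElementTree as ET

NEXML_NS = {"nex": "http://www.nexml.org/2009"}

# Hypothetical, heavily trimmed NeXML document used only for illustration.
SAMPLE_NEXML = """<nexml xmlns="http://www.nexml.org/2009" version="0.9">
  <otus id="taxa1">
    <otu id="t1" label="Homo sapiens"/>
    <otu id="t2" label="Pan troglodytes"/>
  </otus>
  <characters id="m1" otus="taxa1" label="cytb alignment">
    <matrix>
      <row id="r1" otu="t1"><seq>ACGT</seq></row>
      <row id="r2" otu="t2"><seq>ACGA</seq></row>
    </matrix>
  </characters>
  <trees id="trees1" otus="taxa1">
    <tree id="tree1" label="ML tree"/>
  </trees>
</nexml>"""

COLUMNS = ["Source Name", "Sample Name", "Assay Name"]


def map_nexml_to_isa_rows(nexml_text):
    """Apply the logical mapping: matrix row -> source, matrix -> sample, tree -> assay."""
    root = ET.fromstring(nexml_text)

    # Look up a readable label for each OTU, falling back to its id.
    otu_labels = {
        otu.get("id"): otu.get("label", otu.get("id"))
        for otu in root.iterfind(".//nex:otu", NEXML_NS)
    }
    # Assumption: each tree is one inference ("assay") run on every matrix.
    assays = [
        tree.get("label", tree.get("id"))
        for tree in root.iterfind(".//nex:tree", NEXML_NS)
    ]

    rows = []
    for chars in root.iterfind("nex:characters", NEXML_NS):
        sample = chars.get("label", chars.get("id"))
        for row in chars.iterfind("nex:matrix/nex:row", NEXML_NS):
            source = otu_labels.get(row.get("otu"), row.get("id"))
            for assay in assays:
                rows.append(
                    {"Source Name": source, "Sample Name": sample, "Assay Name": assay}
                )
    return rows


def to_tab(rows):
    """Serialize the mapped rows as tab-delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_tab(map_nexml_to_isa_rows(SAMPLE_NEXML)))
```

Each output line pairs one matrix row (the source) with the matrix it belongs to (the sample) and one inferred tree (the assay). The 'technology' and 'methodology' of the inference are left out here because the trimmed sample file carries no metadata describing them.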

Although this project represents several steps forward for minimum information standards and for phylogenetic data sharing and reuse, much remains for the phylogenetic community to do: each of the steps in this mapping must be linked to ontologies, and the ISA configuration must be modified to parse and validate these ontologies.

Student: Elliott Hauser

Mentor: Hilmar Lapp


A high level NeXML to ISAtab Mapping
More Detailed Logical Mapping
All Google Docs related to the project are available here.

Other Resources
See and edit my original application (with comments from mentors) here.
Preliminary notes in Prezi here.
See TreeBASE terms and predicates here.