Difference between revisions of "R Hackathon 1/Programming Goals"

From Phyloinformatics
Revision as of 16:28, 12 November 2007

Creating a standard:

One major goal of the hackathon is to establish stable standards for trees, networks, data, and perhaps model descriptions and analysis settings. When we are done, we should be able to load a tree and character data from a NEXUS file (see Supporting_NEXUS_Documentation and NEXUS Specification) and then pass this information from one package to another without conversion. New R packages for comparative methods will use this standard; existing packages will ideally be modified to meet it, but at a minimum should be able to convert easily between the standard and their internal representations. Desired features include rooting, branch lengths, tree weights, information about ancestral state reconstructions/assignments, and labels.

We have a test set of sample trees and datasets covering a range of input formats (trees with and without branch lengths, rooting, labels, etc.). To get a better idea of how each package structures trees and data, we'd like to collect each package's internal representation of these files, along with notes on the loading procedure and on what was lost (for example, is it clear which trees are rooted and which are unrooted?). For an example of a format description, see Paradis' description of the phylo class (pdf, http://pbil.univ-lyon1.fr/R/ape/misc/FormatTreeR_4Dec2006.pdf) or the scheme for coding nucleotides (pdf, http://pbil.univ-lyon1.fr/R/ape/misc/BitLevelCodingScheme_20April2007.pdf) in APE (http://pbil.univ-lyon1.fr/R/ape/).
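
To illustrate the kind of "what was lost" note we are after, here is a minimal sketch. It is written in Python only to stay neutral between packages and is not part of any R package. It assumes an edge-matrix layout along the lines of the phylo class description: tips numbered 1..n, the root numbered n+1, and one (parent, child) pair per edge. The names EdgeTree, is_rooted and loading_report are hypothetical, and the rule "rooted means the root has at most two children" is an assumption that individual packages may not share.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class EdgeTree:
    """Hypothetical container mirroring an edge-matrix tree layout."""
    tip_labels: List[str]                        # tips are numbered 1..n in this order
    edges: List[Tuple[int, int]]                 # (parent, child) node numbers
    edge_lengths: Optional[List[float]] = None   # None: the file gave no branch lengths

    @property
    def root(self) -> int:
        # Assumed convention: the root is the first internal node, n + 1.
        return len(self.tip_labels) + 1


def is_rooted(tree: EdgeTree) -> bool:
    """Treat a tree as rooted when its root node has at most two children."""
    children = Counter(parent for parent, _ in tree.edges)
    return children[tree.root] <= 2


def loading_report(tree: EdgeTree) -> dict:
    """Summarise what survived loading, for the per-package notes."""
    return {
        "n_tips": len(tree.tip_labels),
        "rooted": is_rooted(tree),
        "has_branch_lengths": tree.edge_lengths is not None,
    }


# ((A,B),C); with branch lengths: root 4 has two children (5 and 3).
rooted = EdgeTree(["A", "B", "C"],
                  [(4, 5), (5, 1), (5, 2), (4, 3)],
                  [1.0, 0.5, 0.5, 1.5])

# (A,B,C); without branch lengths: node 4 has three children.
unrooted = EdgeTree(["A", "B", "C"], [(4, 1), (4, 2), (4, 3)])

print(loading_report(rooted))
print(loading_report(unrooted))
```

Running the same kind of report on each test file in each package would give a comparable record of which features (rooting, branch lengths, labels) each loader preserves.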


Interaction:

   * Mesquite calling R modules.
   * R calling Mesquite modules.
   * R calling existing software (PAUP? MrBayes? others?)

New functions:

   * [What's missing]

Visualization: