Difference between revisions of "R Hackathon 1/Phylobase"

(created)
 
 
(14 intermediate revisions by 4 users not shown)
Line 1: Line 1:
= Phylo4 class =
+
= Phylobase package =
  
 +
[I've renamed this from phylo4, and I'm now transcribing bits and pieces of the TODO list, and a variety of my own questions; I think it will be easier to 'discuss' in this format than in the TODO list within the package within the SVN repository. --[[User:Bbolker|Bbolker]] 22:57, 28 December 2007 (EST)]
 +
 +
== Class definitions ==
  
== Class definition ==
+
See [[R Hackathon 1/Data Standards|Data Standards]] page for definitions and discussion of the data classes (phylo4[d]/multiPhylo4[d]), including arguments about internal structure and some details on methods.
  
Marguerite has kindly started a description of the class on the [[Data standards]] page; I'm not sure whether to copy that here or move it here.
+
== To do ==
  
 +
The to-do section has now been merged with the TODO file inside the package; please refer to that.  Feature requests can be placed at the R-forge page [http://phylobase.r-forge.r-project.org/ phylobase].
  
== Tasks ==
+
[[Category:R Hackathon 1]]
 
 
=== Logistics/maintenance ===
 
 
 
* Talk to Emmanuel and figure out how he feels about phylo4: is he comfortable with phylo4 being set up as a parallel standard (and possibly eventually taking over more of the basic data manipulation stuff), but taking a lot of material from ape?  Does he have any interest in being the maintainer?
 
 
 
* Will this be a CRAN package?  Who will the maintainers be?  Will there be a "phylocore" ? (Discuss at end of week)
 
 
 
=== I/O ===
 
 
 
* Translation to/from phylo[ape] is working (although not tested much) -- translation to/from other packages is low priority since phylo[ape] is the "Rosetta stone" (at least for now; this may change somewhat if other packages end up wanting to use a richer data model)
 
 
 
* Liaison with i/o group: target input functions from Nexus/Newick to create phylo4 or phylo4d objects.  Final versions will have to wait on the data model. [Brian O'Meara]
 
 
 
=== Data class definition ===
 
 
 
* There seems to be consensus that the data model should be at least a little bit richer than a straight data frame, and that we should create a new data class.
 
** We probably want at least one "metadata" tag (a factor w/ levels binary, multistate, DNA (nucleotide?) [? about Nexus definitions ?], amino acid, RNA, continuous, 'other').
 
** There are two camps over how molecular data should be incorporated.  Do we want two slots -- one for molecular and one for non-molecular data -- or do we want to stick the molecular data (especially long alignments) in a data frame as (e.g.) a character? Doing it the first way makes it hard to do subsetting operations transparently; doing it the second way makes handling genetic data harder (and wastes space, etc.).  If we do it the first way we can just extend the data.frame class. A richer or more extensible data class would solve the problem, if it exists.  '''If we design the accessor functions really well, and people use them rather than digging into the guts, we can change things later.'''
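The accessor-function point can be illustrated with a small, language-neutral sketch. It is written in Python purely for illustration and is not phylobase code: the class name <code>TipData</code> and all of its methods are hypothetical, and the list of character types is simply the one proposed above. The idea is that callers only use the accessors, so the internal storage (one structure or two slots) could change later without breaking them.

```python
from dataclasses import dataclass, field

# Character types taken from the proposed "metadata" tag; the final set is undecided.
CHARACTER_TYPES = {"binary", "multistate", "DNA", "amino acid",
                   "RNA", "continuous", "other"}


@dataclass
class TipData:
    """Hypothetical container for per-tip characters, each with a type tag."""
    # Internal storage; callers should go through the accessors below.
    _columns: dict = field(default_factory=dict)  # name -> {tip label: value}
    _types: dict = field(default_factory=dict)    # name -> character type

    def add_character(self, name, values, kind="other"):
        """Store one character (column) together with its metadata tag."""
        if kind not in CHARACTER_TYPES:
            raise ValueError(f"unknown character type: {kind!r}")
        self._columns[name] = dict(values)
        self._types[name] = kind

    def character_type(self, name):
        """Accessor for the metadata tag of one character."""
        return self._types[name]

    def values(self, name):
        """Accessor returning a copy, so callers cannot alter the internals."""
        return dict(self._columns[name])

    def subset_tips(self, tips):
        """Keep only the given tips, for molecular and other data alike."""
        keep = set(tips)
        out = TipData()
        for name, column in self._columns.items():
            kept = {tip: v for tip, v in column.items() if tip in keep}
            out.add_character(name, kept, self._types[name])
        return out
```

Because subsetting is defined once on the container, it behaves the same for a long alignment as for a continuous trait, which is the transparency the first camp is worried about losing.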
 
 
 
=== Tree manipulation ===
 
 
 
We should look at the documentation group's lists of useful operations.
 
 
 
* Movement along branches, traversal, etc.: set up a "tagged" form of a tree that saves the current position on the tree (as an attribute or a slot?) and can be modified by operations like Up, Down?
 
* Drop/prune/na.omit
 
* Subsetting: by data characteristics or by tree
 
** get subtree with all descendants of a node
 
** get largest subtree containing species X and Y
 
** identify node graphically (locator/identify.node) to get subtree
 
** time slices: prune back by time
 
* (list from Wayne about what methods exist in Mesquite)
 
* What's available in other packages?
 
* node identification
 
* reordering (ordering may be an efficiency issue, but not for a while)
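Two of the subsetting operations listed above can be sketched on a plain edge list of (parent, child) pairs, similar in spirit to the edge matrix used by phylo[ape]. This is a minimal illustration in Python, not phylobase or ape code; the function names are hypothetical. It also assumes that "subtree containing species X and Y" means the clade rooted at their most recent common ancestor, which is one possible reading of that item.

```python
from collections import defaultdict


def children_map(edges):
    """Map each parent node to the list of its children."""
    kids = defaultdict(list)
    for parent, child in edges:
        kids[parent].append(child)
    return kids


def descendants(edges, node):
    """All nodes below `node` (not including it), found depth-first."""
    kids = children_map(edges)
    found, stack = [], list(kids.get(node, []))
    while stack:
        current = stack.pop()
        found.append(current)
        stack.extend(kids.get(current, []))
    return found


def ancestors(edges, node):
    """Nodes on the path from `node` up to the root, nearest first."""
    parent = {child: par for par, child in edges}
    path = []
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def mrca(edges, x, y):
    """Most recent common ancestor of x and y (None if they share none)."""
    above_x = set(ancestors(edges, x)) | {x}
    for node in [y] + ancestors(edges, y):
        if node in above_x:
            return node
    return None


def clade(edges, x, y):
    """Nodes of the subtree rooted at the MRCA of x and y."""
    root = mrca(edges, x, y)
    return [root] + descendants(edges, root)


# Toy tree: tips 1-3, root 4, internal node 5, i.e. ((1,2),3).
edges = [(4, 5), (5, 1), (5, 2), (4, 3)]
print(sorted(descendants(edges, 5)))  # tips below node 5
print(sorted(clade(edges, 1, 3)))     # whole tree, since the MRCA is the root
```

Pruning, time slices and graphical node selection would need branch lengths and plotting coordinates, which this sketch leaves out.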
 
 
 
=== Testing ===
 
 
 
* end users need to start reading documentation, testing ...
 

Latest revision as of 14:49, 6 February 2008

Phylobase package

[I've renamed this from phylo4, and I'm now transcribing bits and pieces of the TODO list, and a variety of my own questions; I think it will be easier to 'discuss' in this format than in the TODO list within the package within the SVN repository. --Bbolker 22:57, 28 December 2007 (EST)]

Class definitions

See Data Standards page for definitions and discussion of the data classes (phylo4[d]/multiPhylo4[d]), including arguments about internal structure and some details on methods.

To do

The to-do section has now been merged with the TODO file inside the package; please refer to that. Feature requests can be placed at the R-forge page phylobase.