R Hackathon 1/Trait Evolution SG

From Phyloinformatics
Revision as of 16:37, 24 January 2008 by Lukeh@uidaho.edu (talk)
  • Participants: Harmon, Hipp, Hunt


  1. Compare various implementations of the same methods (ape, geiger, OUCH, Mesquite)
  2. Improve functionality of character fitting in R
  3. Identify gaps in current implementation


  1. Evaluated the results of continuous and discrete character analyses in different packages
    • Packages are mostly consistent
    • Discrepancies come from two sources:
      • Different approaches (e.g. marginal versus joint likelihood)
      • Difficulties in finding the ML solution
    • For continuous characters:
      • geiger and OUCH tend to return the same parameter estimates
      • But they return different likelihoods
    • For discrete characters:
      • geiger and Mesquite are consistent, returning the same parameter estimates and likelihoods
      • geiger and ape return different results
      • ape reports joint likelihoods for ancestral states, based on the single set of ancestral states that together gives the highest likelihood on the whole tree
      • Mesquite and geiger use marginal likelihoods for ancestral states, that is, the likelihood averaged over all possible ancestral character state values
      • As a result, ape and Mesquite can return different ancestral state reconstructions
  2. Improved functionality
    • geiger's fitContinuous now searches the likelihood surface more thoroughly, which gives more reliable results
    • geiger's fitDiscrete now handles a more general set of discrete character models
    • geiger's tree transformations now work for non-ultrametric trees
  3. Identified gaps in current implementation
    • The main gap, from an end-user perspective, is obtaining estimates of ancestral character states in R
    • ape does this, but only for joint likelihoods, and the function sometimes has trouble finding the ML solution
    • There is no way to get marginal ancestral character states for discrete characters in R other than interfacing with Mesquite
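
The difference between joint and marginal reconstruction noted above can be illustrated on a toy example. The following is a minimal sketch in Python, not the code used by ape, geiger, or Mesquite. It assumes a three-tip tree ((A,B),C), a symmetric two-state model, equal branch lengths, and a flat prior on the root state; all function names are hypothetical.

```python
import math
from itertools import product


def p_trans(i, j, rate, t):
    """Transition probability under a symmetric two-state Markov model."""
    decay = math.exp(-2.0 * rate * t)
    return 0.5 + 0.5 * decay if i == j else 0.5 - 0.5 * decay


def assignment_likelihood(root, node, tips, rate, t):
    """Likelihood of the tip data on tree ((A,B),C) for fixed internal states.

    `root` is the root state and `node` is the state of the ancestor of A and B.
    All four branches are assumed to share the same length `t`.
    """
    a, b, c = tips
    return (0.5  # flat prior on the root state (an assumption of this sketch)
            * p_trans(root, node, rate, t)
            * p_trans(node, a, rate, t)
            * p_trans(node, b, rate, t)
            * p_trans(root, c, rate, t))


def joint_and_marginal(tips, rate=1.0, t=0.5):
    """Summarize joint and marginal ancestral state estimates for the toy tree."""
    # Likelihood of every combination of (root, node) states.
    table = {
        (r, n): assignment_likelihood(r, n, tips, rate, t)
        for r, n in product((0, 1), repeat=2)
    }
    # Likelihood of the data, summed over all ancestral state combinations.
    total = sum(table.values())
    # Joint: the single combination of ancestral states with the highest likelihood.
    joint_states = max(table, key=table.get)
    # Marginal: for each node, sum over the states of the other node, then normalize.
    marginal_root = [
        sum(v for (r, _), v in table.items() if r == s) / total for s in (0, 1)
    ]
    marginal_node = [
        sum(v for (_, n), v in table.items() if n == s) / total for s in (0, 1)
    ]
    return {
        "joint_states": joint_states,
        "joint_likelihood": table[joint_states],
        "total_likelihood": total,
        "marginal_root": marginal_root,
        "marginal_node": marginal_node,
    }


if __name__ == "__main__":
    result = joint_and_marginal((0, 0, 1))
    for key, value in result.items():
        print(key, value)
```

The joint likelihood is the largest single entry of the table, while the marginal values sum over entries, so the two approaches report different numbers and need not favor the same state at a given node.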

To do

  1. Implement "white noise" model in geiger's fitContinuous
  2. Investigate statistical properties of these methods
    • Which models can we tell apart?
    • How much data is needed?
    • Are parameter estimates biased?
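
As a rough illustration of the first to-do item and of the bias question, here is a minimal sketch in Python of a "white noise" fit. It assumes that white noise means tip values drawn independently from a single normal distribution, with no phylogenetic covariance. It is not geiger code, and the function names and simulation settings are hypothetical.

```python
import math
import random


def fit_white_noise(values):
    """Maximum likelihood fit of a single normal distribution to tip values."""
    n = len(values)
    mean = sum(values) / n
    var = sum((x - mean) ** 2 for x in values) / n  # ML estimate divides by n
    if var <= 0.0:
        raise ValueError("tip values must not all be identical")
    # Log-likelihood evaluated at the ML estimates of mean and variance.
    loglik = -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)
    aic = 2 * 2 - 2.0 * loglik  # two free parameters: mean and variance
    return {"mean": mean, "var": var, "loglik": loglik, "aic": aic}


def mean_variance_estimate(n_tips, true_var=1.0, reps=20000, seed=1):
    """Average ML variance estimate over simulated white-noise data sets."""
    rng = random.Random(seed)
    sd = math.sqrt(true_var)
    total = 0.0
    for _ in range(reps):
        sample = [rng.gauss(0.0, sd) for _ in range(n_tips)]
        total += fit_white_noise(sample)["var"]
    return total / reps


if __name__ == "__main__":
    print(fit_white_noise([1.0, 2.0, 3.0]))
    # Compare the average estimate with the true variance of 1.0.
    print(mean_variance_estimate(n_tips=5))
```

The simulation addresses the bias question in the simplest case: the ML variance estimate is expected to fall below the true value by a factor of (n - 1)/n, which matters most for small numbers of tips. The same simulate-and-refit pattern could be applied to the other models, given code to simulate them on a tree.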