Difference between revisions of "R Hackathon 1/ContinuousData"

From Phyloinformatics
Jump to: navigation, search
Line 24: Line 24:
 
Currently, there are three packages that implement inference of the BM model: ape in function ace(), geiger in function fitContinuous(), and ouch in function brown.fit().  The function ace() will return a log-likelihood and a parameter estimate for the BM model, but these are conditional upon the reconstructed trait values of the nodes.  This is consistent with the main purpose of ace(), but these results are not comparable with other model fits (which are not usually conditioned in this way).  Thus, unless your main focus is on ancestral reconstructions, you should probably be using the functions in either geiger or ouch to fit the BM model.
 
Currently, there are three packages that implement inference of the BM model: ape in function ace(), geiger in function fitContinuous(), and ouch in function brown.fit().  The function ace() will return a log-likelihood and a parameter estimate for the BM model, but these are conditional upon the reconstructed trait values of the nodes.  This is consistent with the main purpose of ace(), but these results are not comparable with other model fits (which are not usually conditioned in this way).  Thus, unless your main focus is on ancestral reconstructions, you should probably be using the functions in either geiger or ouch to fit the BM model.
  
The functions in geiger and ouch share the approach of not conditioning the model fits on particular values at ancestral nodes.  However, although the log-likelihoods and rate estimates tend to be similar across these two packages, they are not the same.  This difference seems to result from the different computational strategies used in the functions: fitContinuous() in geiger directly maximizes the likelihood, while brown.fit() in ouch uses generalized least-squares (GLS) to solve for the rate parameter and the log-likelihood.  In principle, GLS and ML should approach the same solution, but in practice they will differ somewhat (Check with Marguerite on this).
+
The functions in geiger and ouch share the approach of not conditioning the model fits on particular values at ancestral nodes.  Although their computational strategies differ somewhat, they seem to produce log-likelihoods and rate estimates that are very nearly the same (note that rate parameter in ouch, "sigma", is the square root of the rate parameter in geiger, "beta").
  
 
'''Ornstein-Uhlenbeck'''
 
'''Ornstein-Uhlenbeck'''
  
 
Both geiger and ouch can fit OU models with a single optimum (only ouch can currently fit OU models with more than one optimum).  Again, the approaches differ in algorithmic approach, and they do not yield the same solutions.  In general, this appears to be a challenging optimization problem, so users should evaluate model fits carefully. (More to come here)
 
Both geiger and ouch can fit OU models with a single optimum (only ouch can currently fit OU models with more than one optimum).  Again, the approaches differ in algorithmic approach, and they do not yield the same solutions.  In general, this appears to be a challenging optimization problem, so users should evaluate model fits carefully. (More to come here)

Revision as of 15:53, 13 December 2007

Inferring Models for Continuous Characters in R

Introductory material on continuous models: Brownian motion (BM), Pagel transformations, Ornstein-Uhlenbeck models.


Fitting and Comparing Brownian Motion to Other Models

Suggested examples: (1) fitting simple BM (2) fitting a Pagel transformation (3) fitting an OU-1 model (4) Show model comparisons among 1-3 using AIC


A Note about Multiple Implementations of these Methods

Currently, there are multiple implementations of several of these models across different packages. The results are not always the same across methods, however, because either the purpose of the analysis is slightly different across packages, or the implementations use different computational strategies. The models that currently have multiple implementations as of this writing (13 December 2007) are (i) Brownian motion and (ii) Ornstein-Uhlenbeck with one optimum.

Brownian motion

Currently, there are three packages that implement inference of the BM model: ape in function ace(), geiger in function fitContinuous(), and ouch in function brown.fit(). The function ace() will return a log-likelihood and a parameter estimate for the BM model, but these are conditional upon the reconstructed trait values of the nodes. This is consistent with the main purpose of ace(), but these results are not comparable with other model fits (which are not usually conditioned in this way). Thus, unless your main focus is on ancestral reconstructions, you should probably be using the functions in either geiger or ouch to fit the BM model.

The functions in geiger and ouch share the approach of not conditioning the model fits on particular values at ancestral nodes. Although their computational strategies differ somewhat, they seem to produce log-likelihoods and rate estimates that are very nearly the same (note that rate parameter in ouch, "sigma", is the square root of the rate parameter in geiger, "beta").

Ornstein-Uhlenbeck

Both geiger and ouch can fit OU models with a single optimum (only ouch can currently fit OU models with more than one optimum). Again, the approaches differ in algorithmic approach, and they do not yield the same solutions. In general, this appears to be a challenging optimization problem, so users should evaluate model fits carefully. (More to come here)