R Hackathon 1/Diversification

From Phyloinformatics
Revision as of 18:01, 14 December 2007 by Spric (talk)
Jump to: navigation, search

Analyses of Diversification are currently (Dec. 07) divided between several different packages in R: ape, apTreeshape, laser and geiger. We will use a different dataset and tree from the other examples as the analyses will make more sense with a larger tree with branch times in years.

Before running these analyses think carefully about how polytomies or zero length branches might affect the results.

First make sure that geiger, apTreeshape, laser and geiger are loaded into r

  > library(ape)
  > library(geiger)
  > library(apTreeshape)
  > library(laser)

Unlike other comparative methods packages in r, laser currently (Dec. 07) doesn't work on a phylogenetic tree (represented as a phylo class in ape) instead it works upon branching times. To get branching times from a tree with branch lengths (genetic distance or years) use the function branching.times in ape. This will calculate branching times associated with each node in the phylogeny and place them in an object.

  > mybranchtimes <- branching.times(mytree)

Laser needs the string of branching times to be ordered from 1st (earliest in the phylogeny) to last (most recent), if your string is not ordered it will automatically do this for you.

How do I make a lineage through time plot?

Lineage through time plots can be done in ape and laser. These can be used to graphically depict whether diversification has been constant through time. This example uses the function ltt.plot in ape.

  > ltt.plot(mytree, log="y")

The number of lineages will obviously increase over time if the phylogeny contains only extant species as no extinction is observed. When the log of the number of lineages is plotted through time a straight line indicates diversification has been constant over time. Deviations from the straight line indicate diversification has been constant, if the observed plot lies above the straight line diversification rates have decreased over time and if the plot lives below the line diversification rates have increased over time. WARNING if diversification rates are heterogeneous then lineage-through-time plots are difficult to interpret.

If you want to add another lineage-through-time plot to the same graph you can use the ltt.lines function in ape (this assumes you have more than one tree).

  > ltt.plot(mytree1, log="y")
  > ltt.lines(mytree2, lty=2)

Notice that specifying that the y axis is logarhithmic in ltt.plot means that the second line is also plotted on a log scale. The lty command specifies the type of line so that the second ltt plot can be distinguished from the first plot.

How do I calculate topological measures of diversification?

apTreeshape contains functions for implementing Colless and Sackin's methods for topological measures of diversification as well as tests for significant shifts in diversification (sensu Moore, Chan and Donoghue 2004).

How do I calculate shifts in diversification rate?

How do I calculate net rates of diversification?

geiger contains a function to calculate net rates of diversification (sensu Magellon and Sanderson).

How do I fit a simple pure birth model?

A pure birth model (Yule model) assumes that there is no extinction and there is an instantaneous speciation probability. A birth-death model assumes that there is an instantaneous speciation probability and an instantaneous extinction probability. There are various modifications to the birth-death model that allow the probabilty of extinction and speciation to vary over time (Nee et al., 1994) or for the speciation probability to vary with respect to species traits (Paradis, 2005).

To fit a Yule model to a set of branching times using laser use the function pureBirth.

  > yulemodel<- pureBirth(mybranchtimes)

This gives you the maximum log-likelihood, the Akaike Information Criterion (AIC) and the speciation rate given the maximum log-likelihood.

WARNING - don't use the Yule function in ape as it will return an erroneous log-likelihood although the estimated birth rate will be correct.

How do I fit a pure birth-death model?

To fit a multi-rate pure birth model to a set of branching times using laser use the function yule-n-rate. For example, specifying a yule2rate assumes that a clade diversifies under a speciation rate until some time where it shifts to a new rate, this shift point is found by optimizing parameters and computing likelihoods for a set of possible shift time and selecting the parameter combinations that give the maximum log-likelihood.

  > yulerate2 <- yule2rate(mybranchtimes)

This function requires that your tree is fully dichotomous, if you have zero length branches you might find that several identical branching times are identified as the time when the rate shift occurs this will be because they belong to the same polytomy. The shift times are given given in the divergence units (oftentimes years or genetic divergence) before present.

To fit a simple birth-death model with the Nee et al. (1994) method to take into account the difficulty of estimateing maximum likelihood parameters when only the extant species are observed, use the birthdeath function in ape.

  > birthdeathmytree <- birthdeath(mytree)

How do I calculate the gamma statistic of Pybus & Harvey 2001?


Nee, S., May, R. M. & Harvey, P. H. (1994) The reconstructed evolutionary process. Phil. Trans. Roy. Soc. Lond. B. 349, 25-31.

Paradis, E. (2005) Statistical analysis of diversification with species traits. Evolution 59, 1-12.