Difference between revisions of "R Hackathon 1/PGLS"

From Phyloinformatics

Revision as of 13:49, 13 December 2007

Phylogenetic Generalized Least Squares

PGLS is a powerful method for analyzing continuous data. It has been applied to estimating adaptive optima (Butler and King 2004) and to estimating the relationships among traits (e.g., body size and geographic range size in carnivores). PGLS allows the user to specify different ways in which the tree structure is expected to affect the covariance in trait values across taxa. For example, the user might assume that the trait evolves by Brownian motion and thus that the trait covariance between any pair of taxa decreases linearly with the time (in branch length) since their divergence. Alternatively, the user might apply an Ornstein-Uhlenbeck model, under which the expected covariance decreases exponentially at a rate governed by the parameter alpha (Martins and Hansen 1997). These methods are implemented in the ape package.
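To make the two covariance assumptions concrete, here is a minimal base-R sketch. It does not use ape; the three-taxon tree, the taxon names A, B and C, and the values of sigma2 and alpha are all hypothetical and chosen only for illustration. In ape, the corresponding correlation structures are built by corBrownian and corMartins.

```r
# Hypothetical ultrametric tree ((A:1,B:1):1,C:2) with total depth 2.
taxa  <- c("A", "B", "C")
depth <- 2

# Tip-to-tip (patristic) distances read off the hypothetical tree.
d <- matrix(c(0, 2, 4,
              2, 0, 4,
              4, 4, 0),
            nrow = 3, byrow = TRUE, dimnames = list(taxa, taxa))

sigma2 <- 1    # assumed Brownian rate parameter
alpha  <- 0.5  # assumed strength of the exponential decay

# Brownian motion: covariance is proportional to shared branch length,
# i.e. tree depth minus the time since divergence (half the tip distance).
# It therefore falls off linearly with divergence time.
bm_cov <- sigma2 * (depth - d / 2)

# Exponential (Martins and Hansen style) structure: correlation decays
# as exp(-alpha * distance between the two tips).
ou_cor <- exp(-alpha * d)

print(bm_cov)
print(round(ou_cor, 3))
```

In this toy tree, A and B share half of their history, so their Brownian covariance is half the variance of either tip, while A and C share none and have zero covariance. Under the exponential structure the correlation between A and C is small but not zero.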

Let's return to the Geospiza dataset (within the geiger package) to try PGLS. We assume that you have already loaded the necessary packages (geiger for the data and ape for the function) as described on this page (https://www.nescent.org/wg_phyloinformatics/R_Hackathon/TransitionProbability). Let's say we want to test whether there is a significant relationship between wing length and tarsus length, accounting for possible dependence among the data points (trait values) due to phylogenetic relatedness.

First, we will create a data frame containing our traits of interest, with the row.names matching the tip.labels. NOTE: you must associate the taxon names with the trait values so that the values can be correctly tied to the tips of the tree! See also [this page].

IMPORTANT. The row.names of your data frame must match the tip.labels of your phylogeny. In the example above, the row.names for geodata will match the tip.labels for geotree after "olivacea" has been culled. However, the individual columns of geodata (e.g. geodata$wingL) do not automatically carry the row.names of the whole data frame! If you pass a column of the data frame to ace without first dealing with this issue, the analysis will run, but the tip data will be disassociated from the proper tips!
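The naming problem can be seen in a small base-R sketch. The data frame geodata_demo, its measurements and the tip_labels vector below are made up for illustration. They are not the real Geospiza values, and tip_labels simply stands in for the tip labels of a tree.

```r
# Hypothetical trait table; the numbers are invented for illustration only.
geodata_demo <- data.frame(
  wingL   = c(4.2, 4.0, 4.4),
  tarsusL = c(2.9, 2.8, 3.0),
  row.names = c("fortis", "fusca", "magnirostris")
)

# Stand-in for the tip labels of a tree, deliberately in a different order.
tip_labels <- c("magnirostris", "fortis", "fusca")

# Extracting a column with $ drops the row names.
wing <- geodata_demo$wingL
print(names(wing))  # NULL: the taxon names are gone

# Re-attach the taxon names, then check them against the tip labels.
names(wing) <- row.names(geodata_demo)
missing_from_data <- setdiff(tip_labels, names(wing))
missing_from_tree <- setdiff(names(wing), tip_labels)

# Reorder the named vector so that it follows the tip labels.
wing_ordered <- wing[tip_labels]
print(wing_ordered)
```

Both setdiff results should be empty before you go on. If either one is not, some taxa are present in the tree but not in the data, or the reverse, and need to be culled first.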


We will first build the correlation structure expected if the traits evolve by Brownian motion.

library(geiger)