R Hackathon 1/Package Overviews

On this page, provide a brief description of the package, relevant programming info, future goals (note that this page is public), and anything else you think is relevant.



  • Format used: phylog
  • Conversion from: newick, hclust (package 'stats') and taxo (package 'ade4')
  • Essential features:
    • graphical representations: phylogeny represented alone (plot.phylog, radial.phylog) or together with quantitative variables (dotchart.phylog, symbols.phylog, table.phylog)
    • methods: computations of Abouheif's matrix (A) and other similarity/distance matrices and corresponding eigenvectors, orthogram.

Other (main) features

The phylogenetic features represent only one small component of ade4. The primary goal of ade4 is to implement multivariate methods using the duality diagram. It also offers many graphical representations and several spatial methods. Several packages extend ade4 in different directions, such as adehabitat (habitat analysis) and adegenet (analysis of molecular markers).

More information is available on the ade4 website.

--Jombart 12:23, 19 November 2007 (EST)


General information can be found at: http://ape.mpl.ird.fr/

An SVN repository is now available: https://svn.mpl.ird.fr/ape/

Current work includes improving the likelihood calculations on trees for DNA substitution models. There is a continuous effort to improve what is already in ape in terms of functionality, computing efficiency and integration with other programs, but there is currently no specific plan (problems are dealt with as they arise).


From the apTreeshape documentation:

apTreeshape is mainly dedicated to simulation and analysis of phylogenetic tree topologies using statistical indices. It is a companion library of the 'ape' package. It provides additional functions for reading, plotting, manipulating phylogenetic trees. It also offers convenient web-access to public databases, and enables testing null models of macroevolution using corrected test statistics. Trees of class "phylo" (from 'ape' package) can be converted easily.


From the ComPairWise documentation:

Compares phylogenetic or population genetic data alignments


  • Uses phylo format
    • The new version of geiger uses a structure in which the tree and the phenotypes are stored together as a list
  • Conversion to and from: ouch format
  • Essential features: analyses of diversification and simulation
  • Simulation
    • Simulate phylogenetic trees under birth-death model
    • Simulate character evolution (both discrete and continuous)
  • Analyses
    • Estimate speciation rates from age and extant diversity
    • Fit models of discrete and continuous trait evolution
    • Make disparity-through-time plots
  • New stuff in the pipeline:
    • Fit multi-rate models for continuous trait evolution to phylogenetic trees
    • Combine analyses over multiple trees and/or characters
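As a rough illustration of estimating a rate from clade age and extant diversity, the sketch below uses the standard method-of-moments estimators (pure birth for crown age; stem age with an assumed relative extinction fraction). It is written in plain Python for self-containment and is not geiger's own code; the function names `stem_rate` and `crown_rate_pure_birth` and the parameter `eps` are illustrative assumptions.

```python
import math


def stem_rate(n_extant, stem_age, eps=0.0):
    """Net diversification rate from stem age and extant diversity.

    eps is an ASSUMED relative extinction fraction (extinction / speciation),
    with 0 <= eps < 1; eps = 0 gives the pure-birth estimate log(n) / t.
    """
    if n_extant < 1 or stem_age <= 0 or not 0.0 <= eps < 1.0:
        raise ValueError("need n_extant >= 1, stem_age > 0 and 0 <= eps < 1")
    return math.log(n_extant * (1.0 - eps) + eps) / stem_age


def crown_rate_pure_birth(n_extant, crown_age):
    """Pure-birth rate from crown age: the clade starts with two lineages."""
    if n_extant < 2 or crown_age <= 0:
        raise ValueError("need n_extant >= 2 and crown_age > 0")
    return (math.log(n_extant) - math.log(2.0)) / crown_age
```

For example, `stem_rate(100, 10.0)` is log(100)/10 events per lineage per unit time; raising `eps` lowers the estimated net rate for the same age and diversity.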

--Lukeh@uidaho.edu 20:44, 26 November 2007 (EST)


Primary objectives of LASER: analyses of lineage diversification rates

Operates on branching times, which can be supplied by the user or obtained from phylogenetic trees using ape functions.

Fits various incarnations of the birth-death model to branching times, including constant rate, multi-rate, and density-dependent variants.

Assesses departures from constant-rate diversification using AIC. Can simulate branching times under the Yule model for assessing significance, though this is largely redundant given geiger's phylogenetic simulation capabilities.
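The idea of fitting a constant-rate model to branching times and scoring it with AIC can be sketched as follows. This is a minimal plain-Python illustration, not LASER's code: it assumes a pure-birth (Yule) process started from two lineages, represents the data as waiting times between successive speciation events, and uses hypothetical function names throughout. LASER's own likelihoods may be conditioned differently.

```python
import math
import random


def simulate_yule_waiting_times(n_tips, birth_rate, rng):
    """Waiting times between speciation events, growing from 2 to n_tips lineages.

    With k lineages alive, the time to the next event is Exponential(k * birth_rate).
    """
    return [rng.expovariate(k * birth_rate) for k in range(2, n_tips)]


def fit_yule(waits):
    """Maximum-likelihood birth rate and log-likelihood for the waiting times.

    waits[i] is the time spent with (i + 2) lineages alive.
    """
    total_branch_length = sum(k * w for k, w in enumerate(waits, start=2))
    rate = len(waits) / total_branch_length
    loglik = sum(math.log(k * rate) - k * rate * w
                 for k, w in enumerate(waits, start=2))
    return rate, loglik


def aic(loglik, n_params):
    """Akaike information criterion; lower values indicate a better trade-off."""
    return 2.0 * n_params - 2.0 * loglik
```

A typical use would be `rate, ll = fit_yule(simulate_yule_waiting_times(500, 0.2, random.Random(1)))` followed by `aic(ll, 1)`; a richer model (more parameters) is preferred only if its AIC is lower.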

A major expansion is planned - some of this is code from the analyses in Rabosky et al. 2007 (Proc. R. Soc. B 274:2915-2923) as well as several other projects in the pipeline.

One of my objectives for the Hackathon is to update this package and implement (i) several other analyses of lineage diversification rates and (ii) tests for correlations between traits and rates.

~Dan Rabosky (DLR32 at cornell) 4 Dec 2007

Mesquite (not really an R package, but...)

... we have successfully got it to call R functions and respond to R functions

The following list of functions emphasizes inference and simulation calculations rather than visualization/management features. For a more complete list, see the summary of features or the list of modules.

  • Character evolution
    • inference: ancestral state reconstruction
      • parsimony
        • categorical (unordered, ordered, cost matrices)
        • continuous (linear, least squares)
          • reconstruction of ancestral landmarks
          • "evolutionary pca"
      • likelihood
        • categorical (Mk1, Asymmetrical Mk)
    • inference: character correlation
      • categorical
        • pairwise comparisons
        • Pagel 1994
      • continuous
        • Felsenstein's Indep. Contrasts (through PDAP)
    • simulations: character evolution
      • categorical
        • Mk1, Asymm Mk model
        • standard DNA models (HKY 85, GTR, gamma, etc.)
      • continuous
        • Brownian motion
  • Speciation/Extinction
    • inference: diversification process estimation
      • birth/death likelihood inference
      • BiSSE model likelihood inference (state-dependent)
    • simulations with character controlling diversification:
      • categorical
        • BiSSE simulation (spec'n, ext'n, and character states)
      • continuous
        • Brownian motion evolution of speciation rate
    • simulations without characters:
      • Pure birth trees
      • Birth/death trees
  • Population genetics
    • inference for gene tree/species tree fit:
      • parsimony
        • deep coalescences
        • gene duplication/extinctions
    • simulations:
      • coalescence within a population
      • coalescence within a population tree (with, without migration)
  • Randomizations
    • Characters
      • reshuffling states among characters
      • rarefactions (randomly deleting characters or state codings)
      • random noise to character codings
    • Trees
      • random branch moves
      • random noise to branch lengths
      • randomly reshuffle terminals
      • randomly rarefy tree
      • randomly augment tree
  • Other
    • basic multivariate analysis (PCA etc.)
    • many visualizations (trees, charts, matrices)
    • some tree inference (many criteria)
    • file format translations
    • manual and automated sequence alignment
  • in development
    • integrated sequence proofreading with chromatograms
    • direct links to tree inference packages
    • database connectivity for collaborative projects


This package allows one to fit and compare Ornstein-Uhlenbeck models for evolution along a phylogenetic tree. These models represent the simplest extension beyond simple Brownian motion. They incorporate both "selection" and "drift". Moreover, they allow for multiple selective regimes, each of which is specified by an optimal trait value. A facility for fitting a Brownian motion model is also provided. There are two datasets that come with the package.
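To make the "selection plus drift" idea concrete, here is a minimal plain-Python sketch that simulates a trait along a single lineage passing through successive selective regimes. It is not taken from ouch: the function names, the parameter names (`alpha` for the strength of selection, `sigma` for the drift intensity, `theta` for the regime optimum) and the single-lineage setup are assumptions made for illustration. Setting `alpha = 0` recovers Brownian motion.

```python
import math
import random


def ou_step(x, theta, alpha, sigma, dt, rng):
    """Exact transition over time dt for dX = alpha * (theta - X) dt + sigma dW."""
    if alpha < 0 or sigma < 0 or dt <= 0:
        raise ValueError("need alpha >= 0, sigma >= 0 and dt > 0")
    decay = math.exp(-alpha * dt)
    mean = theta + (x - theta) * decay          # pulled toward the optimum
    if alpha > 0:
        var = sigma ** 2 * (1.0 - decay ** 2) / (2.0 * alpha)
    else:
        var = sigma ** 2 * dt                   # Brownian-motion limit
    return rng.gauss(mean, math.sqrt(var))


def simulate_lineage(x0, regimes, alpha, sigma, dt, rng):
    """Simulate one lineage; regimes is a list of (optimum, n_steps) pairs."""
    path = [x0]
    for theta, n_steps in regimes:
        for _ in range(n_steps):
            path.append(ou_step(path[-1], theta, alpha, sigma, dt, rng))
    return path
```

For instance, `simulate_lineage(0.0, [(1.0, 100), (3.0, 100)], alpha=2.0, sigma=0.5, dt=0.05, rng=random.Random(42))` yields a path that first hovers around 1 and then shifts toward 3 once the regime changes.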

The latest version of 'ouch' has not yet been released to CRAN. It is a major extension that accommodates multivariate quantitative characters. It is written using S4 classes. Here is a link to the repository in which the latest version can be found: [1]. This code is provided for hackathon use only and is not yet ready for public consumption (but will be soon). Please contact me (kingaa at umich dot edu) if you have questions about this.

The nonlinear likelihood maximization which is the key part of the algorithm is more difficult in the multivariate case than it was in the univariate case. We have found R's native 'optim' routine to be insufficiently accurate in some examples. For this reason, the package currently depends on another of my packages, 'subplex'. This package implements a very powerful subspace-searching simplex algorithm due to Tom Rowan (ORNL) and will be released to CRAN soon. For now, one can download it from the same repository.


From the PaleoTS documentation:

This package facilitates analysis of paleontological sequences of trait values from an evolving lineage. Functions are provided to fit, using maximum likelihood, evolutionary models including unbiased random walks, directional evolution and stasis models.
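As a simplified illustration of fitting random-walk models to a trait time series by maximum likelihood, the plain-Python sketch below treats each increment between consecutive samples as normally distributed with mean `mu * dt` and variance `s2 * dt`. It deliberately ignores sampling error in the trait means, and the function name and arguments are assumptions for illustration rather than the PaleoTS interface.

```python
def fit_random_walk(values, times, directional=True):
    """Closed-form ML estimates (mu, s2) for a random walk on a time series.

    directional=True  : general random walk, mean step mu per unit time.
    directional=False : unbiased random walk, mu fixed at 0.
    ASSUMPTION: trait values are known without sampling error.
    """
    if len(values) != len(times) or len(values) < 2:
        raise ValueError("need at least two samples with matching times")
    steps = [(values[i + 1] - values[i], times[i + 1] - times[i])
             for i in range(len(values) - 1)]
    if any(dt <= 0 for _, dt in steps):
        raise ValueError("times must be strictly increasing")
    total_time = sum(dt for _, dt in steps)
    mu = sum(d for d, _ in steps) / total_time if directional else 0.0
    # Step variance per unit time: mean of time-scaled squared residuals.
    s2 = sum((d - mu * dt) ** 2 / dt for d, dt in steps) / len(steps)
    return mu, s2
```

Fitting the same series with `directional=True` and `directional=False` gives the two sets of estimates that a model comparison between directional evolution and an unbiased random walk would start from.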

This package does some of the same fitting of phenotypic evolutionary models as OUCH and ape, but applied to ancestor-descendant series of populations rather than to phylogenetically related populations.


From the PhyloGR documentation:

Manipulation and analysis of phylogenetically simulated data sets and phylogenetically based analyses using GLS.

These programs provide easy reading, manipulation, and plotting of simulated data sets, as well as functions to fit statistical models to those simulated data sets (including linear models of any complexity; principal components analysis; canonical correlation analysis; generalized least squares).


From the PhySim documentation:

PhySim contains functions to simulate phylogenetic trees under a birth-death model. Functions are provided to model a lag-time to speciation and extract sister species ages.


Picante is a package that includes functions for Phylocom integration, community analysis, null-models, traits and evolution in R.

Main functions of relevance to the hackathon are functions for simulating trait evolution (Brownian, OU, bounded Brownian, ACDC), phylogenetic signal (K, variance of contrasts), tools for tree manipulation (pruning, node ages, shared history), community phylogenetic analyses, PIC diagnostics, and interface with comparative analyses in Phylocom.