Phylogenetic Footprinting Documentation

From Phyloinformatics
Revision as of 13:54, 22 December 2006 by Bosborne11@verizon.net (talk) (BioPerl)

Introduction

This Phyloinformatics Hackathon page describes how one might discover and characterize conserved sequence features shared among related genomes or genomic sequences. Footprinting and shadowing are similar approaches, but they typically differ in the number of sequences analyzed and in how closely related those sequences are.

Phylogenetic Footprinting

Phylogenetic footprinting seeks to find regulatory sequences or specific features of non-coding DNA by analyzing DNA within and around aligned, orthologous genes. The assumption is that mutation is better tolerated outside of critical sequence regions, so sequences conserved between species are likely to be critical sequences. The species being studied need to be sufficiently diverged that mutation has acted appreciably, yet sufficiently related that the relationships between RNA- or DNA-binding proteins and their corresponding bound sequences have not significantly changed.

Phylogenetic Shadowing

Phylogenetic shadowing differs in that the group of species in question ideally contains both closely related and more divergent species. The idea is that by combining data from various pairwise comparisons one should observe enough mutation that conserved regions can be identified even if there is no exact alignment between the most divergent pairs. Thus part of the result from this type of analysis is a shadow, a conserved region shared by all species that lacks base-pair definition.
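
As a rough illustration of combining pairwise comparisons, the sketch below scores each alignment column by the fraction of sequence pairs that differ at that position. The function name and the input format (a dict mapping species name to an aligned sequence of equal length) are assumptions made for this example and are not part of any Hackathon deliverable. Columns with a low score across many pairs hint at a conserved region.

```python
from itertools import combinations


def pairwise_mismatch_profile(alignment):
    """Return, for each column, the fraction of sequence pairs that differ there.

    alignment: dict of species name -> aligned sequence (all the same length).
    Gap characters are compared like any other character (a simplification).
    """
    seqs = list(alignment.values())
    pairs = list(combinations(seqs, 2))
    length = len(seqs[0])
    return [
        sum(a[i] != b[i] for a, b in pairs) / len(pairs)
        for i in range(length)
    ]
```

With many closely related species, no single pair contributes much variation, but the pooled profile can still separate variable columns from invariant ones.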

Applications

Computational Steps

  1. Identify and extract orthologous sequences through genome synteny or by some other means.
  2. Align orthologous sequences.
  3. Obtain a tree corresponding to the alignment.
  4. Calculate total tree length of the alignment on the tree.
  5. Implement the tree-length calculation in a sliding-window fashion.
  6. Identify regions that significantly deviate from the average tree-length.
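
Steps 4 to 6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the method used by any particular tool: the tree length is approximated by the Fitch parsimony change count on a fixed rooted binary tree (given as nested tuples of leaf names), gaps are treated as ordinary characters, and "significant deviation" is replaced by a simple z-score cutoff. All function names and the default cutoff are hypothetical.

```python
from statistics import mean, pstdev


def fitch_column(tree, column):
    """Return (state set, change count) for one column on a rooted binary tree.

    tree: nested 2-tuples of leaf names, e.g. (("human", "chimp"), "mouse").
    column: dict of leaf name -> character at this alignment position.
    """
    if isinstance(tree, str):
        return {column[tree]}, 0
    left_set, left_cost = fitch_column(tree[0], column)
    right_set, right_cost = fitch_column(tree[1], column)
    shared = left_set & right_set
    if shared:
        return shared, left_cost + right_cost
    # No state in common: one change is needed on this pair of branches.
    return left_set | right_set, left_cost + right_cost + 1


def site_lengths(tree, alignment):
    """Parsimony change count for every column of the alignment."""
    names = list(alignment)
    n_cols = len(alignment[names[0]])
    return [
        fitch_column(tree, {name: alignment[name][i] for name in names})[1]
        for i in range(n_cols)
    ]


def window_lengths(lengths, window, step=1):
    """Sum per-site lengths in sliding windows; returns (start, total) pairs."""
    return [
        (start, sum(lengths[start:start + window]))
        for start in range(0, len(lengths) - window + 1, step)
    ]


def conserved_windows(windows, z_cutoff=-1.0):
    """Windows whose length lies at least |z_cutoff| SDs below the mean."""
    values = [total for _, total in windows]
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return []  # every window has the same length; nothing stands out
    return [(s, total) for s, total in windows if (total - mu) / sd <= z_cutoff]
```

A usage example on a toy alignment (the sequences are made up for illustration):

```python
tree = (("human", "chimp"), "mouse")
alignment = {
    "human": "ACGTACGTAAAA",
    "chimp": "ACGTACGTAAAC",
    "mouse": "ACGTTCGAAGCT",
}
windows = window_lengths(site_lengths(tree, alignment), window=4)
print(conserved_windows(windows))
```

In practice the tree length would more likely come from branch lengths estimated by a phylogenetics program, and the cutoff from a proper statistical test; the sliding-window structure stays the same.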

Hackathon Contributions

BioPerl

The steps above can be carried out using ClustalW. The BioPerl group has provided a new method, footprint(), in the Bio::Tools::Run::Alignment::Clustalw module, found in the bioperl-run package of BioPerl. The method can perform either phylogenetic footprinting or phylogenetic shadowing.

The BioPerl group has also fixed the run() method in Footprint.pm, also found in the bioperl-run package, so this module can now be used for phylogenetic footprinting.