A Mixed Branch Length Model of Heterotachy Improves Phylogenetic Accuracy

Open Access

3 March 2008

journal article
research article
Published by Oxford University Press (OUP) in Molecular Biology and Evolution

Vol. 25 (6), 1054-1066
https://doi.org/10.1093/molbev/msn042

Abstract

Evolutionary relationships are typically inferred from molecular sequence data using a statistical model of the evolutionary process. When the model accurately reflects the underlying process, probabilistic phylogenetic methods recover the correct relationships with high accuracy. There is ample evidence, however, that models commonly used today do not adequately reflect real-world evolutionary dynamics. Virtually all contemporary models assume that relatively fast-evolving sites are fast across the entire tree, whereas slower sites always evolve at relatively slower rates. Many molecular sequences, however, exhibit site-specific changes in evolutionary rates, called “heterotachy.” Here we examine the accuracy of two phylogenetic methods for incorporating heterotachy: the mixed branch length model, which incorporates site-specific rate changes by summing likelihoods over multiple sets of branch lengths on the same tree, and the covarion model, which uses a hidden Markov process to allow sites to switch between variable and invariable as they evolve. Under a variety of simple heterogeneous simulation conditions, the mixed model was dramatically more accurate than homotachous models, which were subject to topological biases as well as biases in branch length estimates. When data were simulated with strong versions of the types of heterotachy observed in real molecular sequences, the mixed branch length model was more accurate than homotachous techniques. Analyses of empirical data sets confirmed that the mixed branch length model can improve phylogenetic accuracy under conditions that cause homotachous models to fail. In contrast, the covarion model did not improve phylogenetic accuracy compared with homotachous models and was sometimes substantially less accurate. We conclude that a mixed branch length approach, although not the solution to all phylogenetic errors, is a valuable strategy for improving the accuracy of inferred trees.
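
The idea of "summing likelihoods over multiple sets of branch lengths on the same tree" can be illustrated with a minimal Python sketch. It is not the authors' implementation. The Jukes-Cantor (JC69) substitution model, the fixed four-taxon topology ((A,B),(C,D)), the mixture weights, and all function names are assumptions chosen for illustration; the abstract specifies none of them.

```python
import math

BASES = "ACGT"

def jc69_prob(same, t):
    """JC69 transition probability over a branch of length t
    (expected substitutions per site). Assumed model, for illustration."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if same else 0.25 - 0.25 * e

def leaf_partial(base):
    """Partial likelihood vector for an observed nucleotide at a tip."""
    return [1.0 if b == base else 0.0 for b in BASES]

def propagate(partial, t):
    """Partial likelihoods as seen from the parent end of a branch of length t."""
    return [sum(jc69_prob(i == j, t) * partial[j] for j in range(4))
            for i in range(4)]

def site_likelihood(site, lengths):
    """Likelihood of one alignment column on the tree ((A,B),(C,D)).

    site    -- four nucleotides, one per taxon, in the order A, B, C, D
    lengths -- (tA, tB, tC, tD, t_internal), a single set of branch lengths
    """
    a, b, c, d = (leaf_partial(x) for x in site)
    tA, tB, tC, tD, t_int = lengths
    left = [x * y for x, y in zip(propagate(a, tA), propagate(b, tB))]
    right = [x * y for x, y in zip(propagate(c, tC), propagate(d, tD))]
    # The root is placed at the (A,B) node; JC69 is time-reversible,
    # so the root position does not change the likelihood.
    right_up = propagate(right, t_int)
    return sum(0.25 * l * r for l, r in zip(left, right_up))

def mixed_site_likelihood(site, length_sets, weights):
    """Weighted sum of the site likelihood over several branch length sets
    on the same topology (weights are assumed to sum to 1)."""
    return sum(w * site_likelihood(site, bl)
               for w, bl in zip(weights, length_sets))

def mixed_log_likelihood(alignment, length_sets, weights):
    """Log-likelihood of an alignment, given as a list of columns."""
    return sum(math.log(mixed_site_likelihood(s, length_sets, weights))
               for s in alignment)

# Two hypothetical branch length classes in which different lineages are long.
length_sets = [(0.1, 0.9, 0.1, 0.9, 0.05), (0.9, 0.1, 0.9, 0.1, 0.05)]
weights = [0.5, 0.5]
print(mixed_log_likelihood(["AAAA", "ACAC", "AACC"], length_sets, weights))
```

With a single class of weight 1, this reduces to an ordinary homotachous likelihood. In an actual analysis the weights and every set of branch lengths would be estimated from the data rather than fixed by hand as they are here.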


Cited by 80 articles