On Reduced Amino Acid Alphabets for Phylogenetic Inference
- 23 May 2007
- journal article
- research article
- Published by Oxford University Press (OUP) in Molecular Biology and Evolution
- Vol. 24 (9), 2139-2150
- https://doi.org/10.1093/molbev/msm144
Abstract
We investigate the use of Markov models of evolution for reduced amino acid alphabets, or bins of amino acids. The use of reduced amino acid alphabets can ameliorate the effects of model misspecification and saturation. We present algorithms for 2 different ways of automating the construction of bins: minimizing criteria based on properties of rate matrices and minimizing criteria based on properties of alignments. By simulation, we show that, in the absence of model misspecification, the loss of information due to binning is insubstantial and that Markov models at the binned level are almost as effective as the more appropriate missing data approach. By applying these approaches to real data sets where compositional heterogeneity and/or saturation appear to be causing biased tree estimation, we find that binning can improve topological estimation in practice.
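To make the idea of a reduced alphabet concrete, the following is a minimal sketch of recoding a protein sequence into bins. It is not the bin-construction algorithm described in the abstract: the grouping used here is the widely used Dayhoff-style 6-class partition, chosen only for illustration, and the names `EXAMPLE_BINS`, `build_lookup`, and `recode` are hypothetical.

```python
# Illustrative 6-bin partition (Dayhoff-style classes). This is an assumption
# for the example, not the bins derived by the paper's algorithms.
EXAMPLE_BINS = ["AGPST", "DENQ", "HKR", "ILMV", "FWY", "C"]


def build_lookup(bins):
    """Map each amino acid letter to the label (index) of its bin."""
    lookup = {}
    for index, members in enumerate(bins):
        for residue in members:
            if residue in lookup:
                raise ValueError(f"residue {residue!r} appears in two bins")
            lookup[residue] = str(index)
    return lookup


def recode(sequence, lookup):
    """Recode a sequence into bin labels.

    Gap characters ('-') are kept as they are; any other unrecognized
    character is written as '?', i.e. treated as missing.
    """
    recoded = []
    for residue in sequence.upper():
        if residue == "-":
            recoded.append("-")
        else:
            recoded.append(lookup.get(residue, "?"))
    return "".join(recoded)


lookup = build_lookup(EXAMPLE_BINS)
binned = recode("MKV-C", lookup)
```

A recoded alignment of this kind has one state per bin, so a Markov model fitted to it needs a rate matrix over 6 states rather than 20.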