Fast and Consistent Estimation of Species Trees Using Supermatrix Rooted Triples
Open Access
- 15 October 2009
- journal article
- research article
- Published by Oxford University Press (OUP) in Molecular Biology and Evolution
- Vol. 27 (3), 552-569
- https://doi.org/10.1093/molbev/msp250
Abstract
Concatenated sequence alignments are often used to infer species-level relationships. Previous studies have shown that analysis of concatenated data using maximum likelihood (ML) can produce misleading results when loci have differing gene tree topologies due to incomplete lineage sorting. Here, we develop a polynomial time method that utilizes the modified mincut supertree algorithm to construct an estimated species tree from inferred rooted triples of concatenated alignments. We term this method SuperMatrix Rooted Triple (SMRT) and use the notation SMRT-ML when rooted triples are inferred by ML. We use simulations to investigate the performance of SMRT-ML under Jukes–Cantor and general time-reversible substitution models for four- and five-taxon species trees and also apply the method to an empirical data set of yeast genes. We find that SMRT-ML converges to the correct species tree in many cases in which ML on the full concatenated data set fails to do so. SMRT-ML can be conservative in that its output tree is often partially unresolved for problematic clades. We show analytically that when the species tree is clocklike and mutations occur under the Cavender–Farris–Neyman substitution model, as the number of genes increases, SMRT-ML is increasingly likely to infer the correct species tree even when the most likely gene tree does not match the species tree. SMRT-ML is therefore a computationally efficient and statistically consistent estimator of the species tree when gene trees are distributed according to the multispecies coalescent model.
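The two-stage idea in the abstract (infer a rooted triple for every three-taxon subset of the concatenated alignment, then assemble the triples into one tree) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on two loudly stated substitutions: the ML inference of each triple is replaced by a simple distance rule that assumes a clocklike tree (the pair with the fewest mismatches is taken as the cherry), and the modified mincut supertree step is replaced by a BUILD-style recursion that leaves a clade unresolved when its triples conflict, whereas modified mincut would instead cut edges to resolve it further. All function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of mismatching columns between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def infer_rooted_triple(aln, x, y, z):
    """Return ((a, b), c), meaning ab|c, or None if the two smallest distances tie.

    Assumption: stand-in for the ML step; under a clock the closest pair is the cherry.
    """
    candidates = [((x, y), z), ((x, z), y), ((y, z), x)]
    scored = sorted((hamming(aln[a], aln[b]), (a, b), c) for (a, b), c in candidates)
    if scored[0][0] == scored[1][0]:
        return None  # no unique closest pair: leave this triple out
    _, cherry, outgroup = scored[0]
    return cherry, outgroup


def components(taxa, edges):
    """Connected components of an undirected graph, found by depth-first search."""
    adj = {t: set() for t in taxa}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, comps = set(), []
    for t in sorted(taxa):
        if t in seen:
            continue
        stack, comp = [t], set()
        while stack:
            u = stack.pop()
            if u not in comp:
                comp.add(u)
                stack.extend(adj[u] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def build(taxa, triples):
    """BUILD-style recursion over rooted triples; returns a Newick fragment."""
    taxa = set(taxa)
    if len(taxa) == 1:
        return next(iter(taxa))
    # Keep only the triples whose three taxa all lie in the current set.
    local = [(ch, out) for ch, out in triples if set(ch) <= taxa and out in taxa]
    # Join the two cherry taxa of every triple; components become child clades.
    comps = components(taxa, [ch for ch, _ in local])
    if len(comps) == 1:
        # Triples conflict here. Simplification: emit an unresolved clade
        # (modified mincut would cut edges of this graph instead).
        children = sorted(taxa)
    else:
        children = sorted(build(c, local) for c in comps)
    return "(" + ",".join(children) + ")"


def smrt_sketch(aln):
    """Infer all rooted triples from the alignment, then assemble them."""
    triples = []
    for x, y, z in combinations(sorted(aln), 3):
        triple = infer_rooted_triple(aln, x, y, z)
        if triple is not None:
            triples.append(triple)
    return build(aln.keys(), triples) + ";"


# Toy concatenated alignment (made up for illustration only).
toy = {
    "A": "AAAAAAAAAAAA",
    "B": "AAAAAAAAAAAC",
    "C": "AAAAAAAACCGG",
    "D": "CCCCCCCCAAAA",
}
print(smrt_sketch(toy))  # (((A,B),C),D);
```

For n taxa the sketch evaluates all n(n-1)(n-2)/6 triples, each on a three-sequence alignment, which is why a triple-based approach stays polynomial in the number of taxa. Conflicting triples produce a multifurcation rather than a forced resolution, loosely mirroring the conservative behaviour described in the abstract.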