Confirming the Phylogeny of Mammals by Use of Large Comparative Sequence Data Sets
Open Access
- 2 May 2008
- journal article
- research article
- Published by Oxford University Press (OUP) in Molecular Biology and Evolution
- Vol. 25 (9), 1795-1808
- https://doi.org/10.1093/molbev/msn104
Abstract
The ongoing generation of prodigious amounts of genomic sequence data from myriad vertebrates is providing unparalleled opportunities for establishing definitive phylogenetic relationships among species. The size and complexities of such comparative sequence data sets not only allow smaller and more difficult branches to be resolved but also present unique challenges, including large computational requirements and the negative consequences of systematic biases. To explore these issues and to clarify the phylogenetic relationships among mammals, we have analyzed a large data set of over 60 megabase pairs (Mb) of high-quality genomic sequence, which we generated from 41 mammals and 3 other vertebrates. All sequences are orthologous to a 1.9-Mb region of the human genome that encompasses the cystic fibrosis transmembrane conductance regulator gene (CFTR). To understand the characteristics and challenges associated with phylogenetic analyses of such a large data set, we partitioned the sequence data in several ways and utilized maximum likelihood, maximum parsimony, and Neighbor-Joining algorithms, implemented in parallel on Linux clusters. These studies yielded well-supported phylogenetic trees, largely confirming other recent molecular phylogenetic analyses. Our results provide support for rooting the placental mammal tree between Atlantogenata (Xenarthra and Afrotheria) and Boreoeutheria (Euarchontoglires and Laurasiatheria), illustrate the difficulty in resolving some branches even with large amounts of data (e.g., in the case of Laurasiatheria), and demonstrate the valuable role that very large comparative sequence data sets can play in refining our understanding of the evolutionary relationships of vertebrates.