Application of the Character Compatibility Approach to Generalized Molecular Sequence Data: Branching Order of the Proteobacterial Subdivisions
Open Access
- 9 December 2006
- journal article
- Published by Springer Nature in Journal of Molecular Evolution
- Vol. 64 (1), 90-100
- https://doi.org/10.1007/s00239-006-0082-2
Abstract
The character compatibility approach, which removes all homoplasic characters and involves finding the largest clique of compatible characters in a dataset, in principle, provides a powerful means for obtaining correct topology in difficult to resolve cases. However, the usefulness of this approach to generalized molecular sequence data for phylogeny determination has not been studied in the past. We have used this approach to determine the topology of 23 proteobacterial species (6 each of α-, β- and γ-, 3 δ-, and 2 ε-proteobacteria) using sequence data for 10 conserved proteins (Hsp60, Hsp70, EF-Tu, EF-G, alanyl-tRNA synthetase, RecA, GyrA, GyrB, RpoB and RpoC). All sites in the sequence alignments of these proteins where only two amino acids were found, with each amino acid present in at least two species, were selected. Mutual compatibility determination on these binary state sites was carried out by two means. In one case, all of these sites were combined into a large dataset (Set A; 957 characters) prior to compatibility analysis. In the second case, compatibility analysis was carried out on characters from individual proteins and all compatible sites were combined into a large dataset (Set B; 398 characters) for further studies. Upon compatibility analyses, the largest cliques that were obtained from Sets A and B consisted of 337 and 323 compatible characters, respectively. In these cliques, all proteobacterial subgroups were clearly distinguished and branching orders of most of the species were also resolved. The ε-proteobacteria exhibited the earliest branching, whereas the β- and γ-subgroups were found to have emerged last. The relative placement of the α- and δ-subgroups, however, was not resolved. The topology of these species was also determined based on 16S rRNA sequences and a concatenated dataset of sequences for all 10 proteins by means of neighbor-joining, maximum likelihood, and maximum parsimony methods. In the protein trees, all proteobacterial groups were reliably resolved and they branched in the following order: (ε(δ(α(β,γ)))). However, in the rRNA trees, the γ- and β-subgroups exhibited polyphyletic branching and many internal nodes were not resolved. These results indicate that the character compatibility analysis using generalized molecular sequence data provides a powerful means for evolutionary studies. Based on molecular sequences, it should be possible to obtain very large datasets of compatible characters that should prove very helpful in clarifying difficult to resolve phylogenetic relationships.Keywords
This publication has 69 references indexed in Scilit:
- Molecular signatures (unique proteins and conserved indels) that are specific for the epsilon proteobacteria (Campylobacterales)BMC Genomics, 2006
- Dichotomy of major bacterial phyla inferred from gene arrangement comparisonsJournal of Theoretical Biology, 2006
- A Phylogenomic Approach to Bacterial Phylogeny: Evidence of a Core of Genes Sharing a Common HistoryGenome Research, 2002
- Gene Arrangements and Phylogeny in the Class ProteobacteriaJournal of Theoretical Biology, 2001
- Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in hominoideaJournal of Molecular Evolution, 1989
- Compatibility Methods in SystematicsAnnual Review of Ecology and Systematics, 1985
- Evolutionary trees from DNA sequences: A maximum likelihood approachJournal of Molecular Evolution, 1981
- A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequencesJournal of Molecular Evolution, 1980
- Criteria for optimising phylogenetic trees and the problem of determining the root of a treeJournal of Molecular Evolution, 1976
- Toward Defining the Course of Evolution: Minimum Change for a Specific Tree TopologySystematic Zoology, 1971