Application of the Character Compatibility Approach to Generalized Molecular Sequence Data: Branching Order of the Proteobacterial Subdivisions

Open Access

9 December 2006

journal article
Published by Springer Nature in Journal of Molecular Evolution

Vol. 64 (1), 90-100
https://doi.org/10.1007/s00239-006-0082-2

Abstract

The character compatibility approach, which removes all homoplasic characters and involves finding the largest clique of compatible characters in a dataset, in principle provides a powerful means for obtaining the correct topology in difficult-to-resolve cases. However, the usefulness of this approach for phylogeny determination from generalized molecular sequence data has not been studied previously. We have used this approach to determine the topology of 23 proteobacterial species (6 each of α-, β- and γ-, 3 δ-, and 2 ε-proteobacteria) using sequence data for 10 conserved proteins (Hsp60, Hsp70, EF-Tu, EF-G, alanyl-tRNA synthetase, RecA, GyrA, GyrB, RpoB and RpoC). All sites in the sequence alignments of these proteins where only two amino acids were found, with each amino acid present in at least two species, were selected. Mutual compatibility of these binary-state sites was determined in two ways. In the first case, all of these sites were combined into a large dataset (Set A; 957 characters) prior to compatibility analysis. In the second case, compatibility analysis was carried out on characters from individual proteins, and all compatible sites were then combined into a large dataset (Set B; 398 characters) for further study. The largest cliques obtained from Sets A and B consisted of 337 and 323 compatible characters, respectively. In these cliques, all proteobacterial subgroups were clearly distinguished and the branching orders of most of the species were also resolved. The ε-proteobacteria exhibited the earliest branching, whereas the β- and γ-subgroups were found to have emerged last. The relative placement of the α- and δ-subgroups, however, was not resolved. The topology of these species was also determined from 16S rRNA sequences and from a concatenated dataset of sequences for all 10 proteins by means of neighbor-joining, maximum likelihood, and maximum parsimony methods. In the protein trees, all proteobacterial groups were reliably resolved and they branched in the following order: (ε(δ(α(β,γ)))). However, in the rRNA trees, the γ- and β-subgroups exhibited polyphyletic branching and many internal nodes were not resolved. These results indicate that character compatibility analysis using generalized molecular sequence data provides a powerful means for evolutionary studies. Based on molecular sequences, it should be possible to obtain very large datasets of compatible characters that should prove very helpful in clarifying difficult-to-resolve phylogenetic relationships.
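
The site selection and clique search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy alignment, and the use of the pairwise "four-gamete" criterion for compatibility of two binary characters are assumptions made for illustration, as is the plain Bron-Kerbosch clique search (exponential in the worst case, so suitable only for small examples). The alignment is assumed to be a list of equal-length strings; a gap symbol would simply count as an extra state, so a gapped column is excluded unless it still shows exactly two states.

```python
from itertools import combinations


def binary_informative_sites(alignment):
    """Columns with exactly two states, each present in at least two sequences."""
    sites = []
    for col in range(len(alignment[0])):
        counts = {}
        for seq in alignment:
            counts[seq[col]] = counts.get(seq[col], 0) + 1
        if len(counts) == 2 and min(counts.values()) >= 2:
            sites.append(col)
    return sites


def compatible(alignment, i, j):
    """Assumed pairwise test: two binary sites conflict only if all four
    state combinations occur among the sequences (four-gamete criterion)."""
    return len({(seq[i], seq[j]) for seq in alignment}) < 4


def largest_clique(nodes, adjacent):
    """Largest clique via basic Bron-Kerbosch (no pivoting)."""
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:          # r is a maximal clique
            if len(r) > len(best):
                best = sorted(r)
            return
        for v in list(p):
            expand(r | {v}, p & adjacent[v], x & adjacent[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(nodes), set())
    return best


def largest_compatible_set(alignment):
    """Select binary sites, build the compatibility graph, return its largest clique."""
    sites = binary_informative_sites(alignment)
    adjacent = {s: set() for s in sites}
    for i, j in combinations(sites, 2):
        if compatible(alignment, i, j):
            adjacent[i].add(j)
            adjacent[j].add(i)
    return largest_clique(sites, adjacent)


# Hypothetical toy alignment (six sequences, five columns), not data from the paper.
toy = ["AAKLS", "AAKLS", "AGKMS", "GGRMT", "GGRLS", "GGRMS"]
print(binary_informative_sites(toy))   # column 4 is dropped: "T" occurs only once
print(largest_compatible_set(toy))     # column 3 conflicts with columns 0 and 2
```

In this toy case the last column is excluded because one of its two states appears in a single sequence, and the fourth column is left out of the largest clique because it shows all four state combinations with the first and third columns.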
