Phylogenetic Approaches to the Identification and Characterization of Protein Families and Superfamilies

1 January 1996

journal article
review article
Published by Mary Ann Liebert Inc in Genome Science and Technology

Vol. 1 (3), 129-150
https://doi.org/10.1089/mcg.1996.1.129

Abstract

With the advent of megabase genome sequencing, the need for computational analyses increases exponentially. Sequencing errors must be corrected, encoded proteins must be identified, functions must be assigned to these proteins, and distant phylogenetic relationships must be recognized in order to maximize the yield of information obtainable from genome sequencing projects. Both the computer and the human brain have their limitations, but using them in combination, the biologist can vastly extend his or her analytic capabilities. Computer techniques can be used to estimate protein structure, function, biogenesis, and evolution. In this review, the application of available computer programs to several protein families, particularly transport, receptor, and transcriptional regulatory protein families, illustrates our current capabilities and limitations. Although some multidomain protein families are evolutionarily homogeneous, others have mosaic origins. Evidence concerning the nature and frequency of occurrence of domain shuffling, splicing, fusion, deletion, and duplication during evolution of specific protein families is evaluated. It is shown that specific families of enzymes, receptors, transport proteins, and transcriptional regulatory proteins share a common evolutionary origin, frequently diverging in function because of domain splicing and ligation. Some large families arose gradually over evolutionary time, whereas others developed suddenly, due to bursts of intragenic or intergenic (or both) duplication events occurring over relatively short periods of time. It is argued that energy coupling to transport was a late occurrence, superimposed on preexisting mechanisms of solute facilitation. It is also shown that several transport protein families have evolved independently of each other, employing different routes, at different times in evolutionary history, to give topologically similar transmembrane protein complexes.
