Binary patterning of polar and nonpolar amino acids in the sequences and structures of native proteins
- 1 October 1995
- journal article
- research article
- Published by Wiley in Protein Science
- Vol. 4 (10), 2032-2039
- https://doi.org/10.1002/pro.5560041008
Abstract
Protein sequences can be represented as binary patterns of polar (○) and nonpolar (•) amino acids. These binary sequence patterns are categorized into two classes: Class A patterns match the structural repeat of an idealized amphiphilic α‐helix (3.6 residues per turn), and class B patterns match the structural repeat of an idealized amphiphilic β‐strand (2 residues per turn). The difference between these two classes of sequence patterns has led to a strategy for de novo protein design based on binary patterning of polar and nonpolar amino acids. Here we ask whether similar binary patterning is incorporated in the sequences and structures of natural proteins. Analysis of the Protein Data Bank demonstrates the following. (1) Class A sequence patterns occur considerably more frequently in the sequences of natural proteins than would be expected at random, but class B patterns occur less often than expected. (2) Each pattern is found predominantly in the secondary structure expected from the binary strategy for protein design. Thus, class A patterns are found more frequently in α‐helices than in β‐strands, and class B patterns are found more frequently in β‐strands than in α‐helices. (3) Among the α‐helices of natural proteins, the most commonly used binary patterns are indeed the class A patterns. (4) Among all β‐strands in the database, the most commonly used binary patterns are not the expected class B patterns. (5) However, for solvent‐exposed β‐strands, the correlation is striking: All β‐strands in the database that contain the class B patterns are exposed to solvent. (6) The bias of class A patterns for α‐structure over β‐structure and the bias of class B patterns for β‐structure over α‐structure are significant, not merely when compared to other binary patterns of polar (○) and nonpolar (•) amino acids, but also when compared to the full range of sequences in the database. The implications for the design of novel proteins are discussed.
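The binary representation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure, and several pieces are assumptions made only for illustration: the `NONPOLAR` residue set (the abstract does not list which residues were treated as nonpolar), the 180° nonpolar arc used in `helical`, and the independent-position null model in `expected_count`. The only values taken from the abstract are the two structural repeats: 3.6 residues per turn (100° per residue) for class A and 2 residues per turn for class B.

```python
from collections import Counter

# Assumption: illustrative nonpolar set; the abstract does not define one.
NONPOLAR = set("AVLIMFWC")


def to_binary(seq: str) -> str:
    """Map each residue to '•' (nonpolar) or '○' (polar)."""
    return "".join("•" if aa in NONPOLAR else "○" for aa in seq.upper())


def alternating(length: int) -> str:
    """Class B-style pattern: nonpolar and polar alternate (2 residues per turn)."""
    return "".join("•" if i % 2 == 0 else "○" for i in range(length))


def helical(length: int) -> str:
    """Class A-style pattern from a helical wheel with 100 degrees per residue.

    Assumption: a residue is nonpolar when it falls on a 180-degree arc
    centred on residue 0; the arc width is a hypothetical choice.
    """
    out = []
    for i in range(length):
        angle = (i * 100) % 360
        out.append("•" if angle <= 90 or angle >= 270 else "○")
    return "".join(out)


def count_windows(binary: str, width: int) -> Counter:
    """Count every overlapping window of the given width in a binary string."""
    return Counter(binary[i:i + width] for i in range(len(binary) - width + 1))


def expected_count(pattern: str, binary: str) -> float:
    """Expected occurrences of a pattern under a simple random model.

    Assumption: positions are independent and each is nonpolar with the
    observed nonpolar fraction of the string.
    """
    n_windows = len(binary) - len(pattern) + 1
    if n_windows <= 0:
        return 0.0
    p = binary.count("•") / len(binary)
    k = pattern.count("•")
    return n_windows * p**k * (1 - p) ** (len(pattern) - k)
```

Comparing `count_windows(to_binary(seq), width)[pattern]` against `expected_count(pattern, to_binary(seq))` gives a rough sense of whether a pattern is over- or under-represented in one sequence. A database-scale analysis like the one reported would also need secondary-structure and solvent-exposure annotations, which this sketch does not cover.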