Evidence for nonrandom hydrophobicity structures in protein chains.

3 September 1996

journal article
Published in Proceedings of the National Academy of Sciences

Vol. 93 (18), 9533-9538
https://doi.org/10.1073/pnas.93.18.9533

Abstract

The question of whether proteins originate from random sequences of amino acids is addressed. A statistical analysis is performed in terms of blocked and random walk values formed by binary hydrophobic assignments of the amino acids along the protein chains. Theoretical expectations of these variables from random distributions of hydrophobicities are compared with those obtained from functional proteins. The results, which are based upon proteins in the SWISS-PROT database, convincingly show that the amino acid sequences in proteins differ from what is expected from random sequences in a statistically significant way. By performing Fourier transforms on the random walks, one obtains additional evidence for nonrandomness of the distributions. We have also analyzed results from a synthetic model containing only two amino acid types, hydrophobic and hydrophilic. With reasonable criteria on good folding properties in terms of thermodynamic and kinetic behavior, sequences that fold well are isolated. Performing the same statistical analysis on the sequences that fold well reveals deviations from randomness similar to those found for the functional proteins. The deviations from randomness can be interpreted as originating from anticorrelations in terms of an Ising spin model for the hydrophobicities. Our results, which differ from some previous investigations using other methods, may bear on how permissive the protein folding process is with respect to sequence specificity: only sequences with nonrandom hydrophobicity distributions fold well. Other distributions give rise to energy landscapes with poor folding properties and hence did not survive evolution.
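
The kind of analysis the abstract describes can be illustrated with a minimal sketch. It is not the authors' code. The hydrophobic residue set, the mean subtraction in the walk, the block definition, and the shuffling baseline are all assumptions made for illustration; the paper compares against theoretical expectations for random distributions, which the shuffled baseline here only approximates.

```python
import cmath
import random

# Assumption: one possible binary hydrophobic assignment (hypothetical choice;
# the paper's own assignment may differ).
HYDROPHOBIC = set("AVLIMFWC")


def to_spins(sequence):
    """Map each residue to +1 (hydrophobic) or -1 (hydrophilic)."""
    return [1 if aa in HYDROPHOBIC else -1 for aa in sequence]


def random_walk(spins):
    """Cumulative sum of mean-subtracted spins; the walk ends at zero."""
    mean = sum(spins) / len(spins)
    walk, total = [], 0.0
    for s in spins:
        total += s - mean
        walk.append(total)
    return walk


def block_sums(spins, size):
    """Sum the spins over consecutive non-overlapping blocks of `size` residues."""
    return [sum(spins[i:i + size]) for i in range(0, len(spins) - size + 1, size)]


def mean_square(values):
    """Average of the squared values, used here as a simple summary statistic."""
    return sum(v * v for v in values) / len(values)


def power_spectrum(walk):
    """Squared magnitude of a plain discrete Fourier transform, k = 1 .. n // 2."""
    n = len(walk)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(walk))) ** 2 / n
        for k in range(1, n // 2 + 1)
    ]


def shuffled_baseline(spins, statistic, trials=200, seed=0):
    """Average a statistic over shuffled copies with the same composition."""
    rng = random.Random(seed)
    pool = list(spins)
    values = []
    for _ in range(trials):
        rng.shuffle(pool)
        values.append(statistic(pool))
    return sum(values) / len(values)


if __name__ == "__main__":
    # Toy alternating sequence: strongly anticorrelated hydrophobicities.
    spins = to_spins("AK" * 20)
    walk_stat = lambda s: mean_square(random_walk(s))
    print("observed walk statistic:", walk_stat(spins))
    print("shuffled baseline:", shuffled_baseline(spins, walk_stat))
```

In this toy case the alternating sequence produces a walk that stays close to zero, so its mean-square excursion falls well below the shuffled baseline. That is the signature one would expect from the anticorrelations the abstract mentions; correlated (clustered) hydrophobicities would push the statistic above the baseline instead.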



Cited by 77 articles