Predicting leucine zipper structures from sequence

Abstract
The leucine zipper structure is adopted by one family of coiled-coil proteins. Leucine zippers have a characteristic leucine repeat: Leu–X6–Leu–X6–Leu–X6–Leu (where X may be any residue). However, many sequences contain the leucine repeat but do not adopt the leucine zipper structure (we refer to these as non-zippers). We have found and analyzed residue pair patterns that correctly identify 90% of leucine zippers and 97% of non-zippers. Simpler analyses, based on the frequency of occurrence of residues at certain positions, correctly identify at most 65% of zippers and 80–90% of non-zippers. Both short and long patterns contribute to the successful discrimination of leucine zippers from non-zippers. A number of these patterns involve hydrophobic residues that would lie on the solvent-exposed surface of the helix if the sequence adopted a leucine zipper structure. Thus, an analysis of protein sequences has allowed us to improve discrimination between leucine zippers and non-zippers, and has provided some further insight into the physical factors influencing the leucine zipper structure.
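As an illustration of the leucine repeat described above, the following minimal Python sketch scans a one-letter amino acid sequence for Leu–X6–Leu–X6–Leu–X6–Leu. It only locates candidate segments that carry the repeat; it does not implement the residue pair analysis, and, as noted above, many such candidates are non-zippers. The names `LEUCINE_REPEAT` and `find_leucine_repeats` are hypothetical and chosen for this example only.

```python
import re

# Leu-X6-Leu-X6-Leu-X6-Leu: four leucines, each seven residues after the last.
# Assumption: the sequence uses one-letter codes, so X is matched by [A-Z].
# The lookahead lets overlapping candidates be reported separately.
LEUCINE_REPEAT = re.compile(r"(?=(L[A-Z]{6}L[A-Z]{6}L[A-Z]{6}L))")


def find_leucine_repeats(sequence):
    """Return (start, segment) pairs for every occurrence of the repeat.

    Start positions are 0-based. A hit means only that the repeat is
    present, not that the segment forms a leucine zipper.
    """
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in LEUCINE_REPEAT.finditer(seq)]


if __name__ == "__main__":
    # Synthetic sequence for demonstration, not a real protein.
    demo = "MK" + "LAAAAAA" * 3 + "L" + "GS"
    for start, segment in find_leucine_repeats(demo):
        print(start, segment)
```

Each reported segment is 22 residues long. A sequence with five or more leucines at seven-residue spacing yields several overlapping segments, which a downstream classifier could merge or score individually.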