Abstract
The search for amino acid sequence homologies can be a powerful tool for predicting protein structure. Discovered sequence homologies are currently used in predicting the function of oncogene proteins. To sharpen this tool, the structural significance of short sequence homologies was investigated by searching proteins of known 3-dimensional structure for subsequence identities. In 62 proteins with 10,000 residues, the longest isolated homologies between unrelated proteins are 5 residues long. In 6 (out of 25) cases there was surprising structural adaptability: the same 5 residues are part of an α-helix in one protein and part of a β-strand in another protein. These examples show quantitatively that pentapeptide structure within a protein is strongly dependent on sequence context, a fact essentially ignored in most protein structure prediction methods: just considering the local sequence of 5 residues is not sufficient to predict correctly the local conformation (secondary structure). Cooperativity of length 6 or longer must be taken into account. Also, in the growing practice of comparing a new protein sequence with a data base of known sequences, finding an identical pentapeptide sequence between 2 proteins is not a significant indication of structural similarity or of evolutionary kinship.
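The subsequence search described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names `shared_kmers` and `conformations`, the toy sequences, and the per-residue secondary-structure strings (H for helix, E for strand, C for other) are all invented for illustration, and the sketch does not check that a match is isolated (cannot be extended) or that the two proteins are unrelated.

```python
from collections import defaultdict

# Hypothetical toy data: name -> (sequence, per-residue secondary structure).
# These are NOT real proteins; H = helix, E = strand, C = other.
TOY = {
    "protA": ("MKTAVLKAAGE", "CCHHHHHHHCC"),
    "protB": ("GSVLKAATRW", "CEEEEEEECC"),
}

def shared_kmers(proteins, k=5):
    """Return k-residue subsequences that occur in at least two
    different proteins, mapped to their (protein, offset) sites."""
    sites = defaultdict(list)
    for name, (seq, _ss) in proteins.items():
        for i in range(len(seq) - k + 1):
            sites[seq[i:i + k]].append((name, i))
    return {
        kmer: occ
        for kmer, occ in sites.items()
        if len({name for name, _ in occ}) > 1
    }

def conformations(proteins, occurrences, k=5):
    """Look up the secondary-structure string at each matching site."""
    return {name: proteins[name][1][i:i + k] for name, i in occurrences}

hits = shared_kmers(TOY)
for kmer, occ in hits.items():
    print(kmer, conformations(TOY, occ))
```

In this toy input the single shared pentapeptide is helical in one sequence and strand in the other, which is the kind of context dependence the abstract reports for 6 of the 25 pentapeptide matches.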