Abstract
Many regulatory peptide precursors undergo post-translational processing at mono- and/or dibasic residues. Comparison of the amino acids around monobasic cleavage sites suggests that these cleavages follow certain sequence motifs, which can be described as rules that govern monobasic cleavages: (i) a basic amino acid is present at 3, 5, or 7 amino acids N-terminal to the cleavage site, (ii) hydrophobic aliphatic amino acids (leucine, isoleucine, valine, or methionine) are never present in the position C-terminal to the monobasic amino acid at the cleavage site, (iii) a cysteine is never present in the vicinity of the cleavage site, and (iv) an aromatic amino acid is never present at the position N-terminal to the monobasic amino acid at the cleavage site. In addition to these rules, the monobasic cleavages follow certain tendencies: (i) the amino acid at the cleavage site is predominantly arginine, (ii) the amino acid at the position C-terminal to the cleavage site is serine, alanine, or glycine in more than 60% of cases, (iii) the amino acid at position 3, 5, or 7 N-terminal to the cleavage site tends to be arginine, (iv) aromatic amino acids are rare at the position C-terminal to the monobasic amino acid at the cleavage site, and (v) aliphatic amino acids tend to occupy the two positions N-terminal to and the two positions C-terminal to the cleavage site, except as noted above. When applied to a large number of sequences containing single basic amino acids, these rules and tendencies not only correctly predict the processing sites but also exclude most of the single basic sequences that are known to be uncleaved. Many of these rules can also be applied to correctly predict dibasic and multibasic cleavage sites, suggesting that the rules and tendencies could govern endoproteolytic processing at monobasic, dibasic, and multibasic sites.
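
The four rules above can be sketched as a simple sequence filter. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function name `passes_monobasic_rules` is hypothetical, the basic residues are taken to be lysine and arginine only, aromatic residues are taken to be phenylalanine, tryptophan, and tyrosine, positions are counted from the monobasic residue itself, and the "vicinity" checked for cysteine is assumed to be two residues on either side, since the abstract does not define it. The tendencies are not modeled.

```python
# Residue classes in one-letter code.
BASIC = set("KR")        # assumption: lysine and arginine only
ALIPHATIC = set("LIVM")  # Leu, Ile, Val, Met, as listed in rule (ii)
AROMATIC = set("FWY")    # assumption: Phe, Trp, Tyr
CYS_WINDOW = 2           # assumption: "vicinity" = 2 residues either side


def passes_monobasic_rules(seq: str, i: int) -> bool:
    """Return True if the basic residue at index i satisfies rules (i)-(iv).

    Positions are counted relative to the monobasic residue seq[i];
    negative offsets are N-terminal, positive offsets are C-terminal.
    """
    if not 0 <= i < len(seq) or seq[i] not in BASIC:
        return False

    def at(offset: int) -> str:
        # Residue at the given offset, or "" if it falls outside the sequence.
        j = i + offset
        return seq[j] if 0 <= j < len(seq) else ""

    # Rule (i): a basic residue 3, 5, or 7 positions N-terminal.
    if not any(at(-k) in BASIC for k in (3, 5, 7)):
        return False
    # Rule (ii): no hydrophobic aliphatic residue immediately C-terminal.
    if at(1) in ALIPHATIC:
        return False
    # Rule (iii): no cysteine within the assumed window around the site.
    if any(at(k) == "C" for k in range(-CYS_WINDOW, CYS_WINDOW + 1)):
        return False
    # Rule (iv): no aromatic residue immediately N-terminal.
    if at(-1) in AROMATIC:
        return False
    return True


def candidate_sites(seq: str) -> list[int]:
    """Indices of all basic residues in seq that pass rules (i)-(iv)."""
    return [i for i in range(len(seq)) if passes_monobasic_rules(seq, i)]
```

For example, with the made-up sequence `"GGGKAGRSG"` (not a real precursor), `candidate_sites` returns `[6]`: the arginine at index 6 has a lysine three positions N-terminal and violates none of the exclusion rules, while the lysine at index 3 has no basic residue upstream. Changing the window for cysteine or adding histidine to the basic set would change which sites pass, so both choices should be checked against the full paper before use.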