Complete nucleotide sequence of a French bean storage protein gene: Phaseolin

Abstract
The complete nucleotide sequences of the gene and the mRNA encoding a specific type of phaseolin, the major storage protein of French bean, were determined. Comparison of these sequences reveals a phaseolin gene structure consisting of 80 base pairs (bp) of 5' untranslated DNA; 1,263 bp of protein-encoding DNA interrupted by 5 intervening sequences (IVS1, 72 bp; IVS2, 88 bp; IVS3, 124 bp; IVS4, 128 bp; and IVS5, 103 bp); and 135 bp of 3' untranslated DNA. The sequences CCAAT and TATA, characteristic of eukaryotic promoters, are present in the 5' flanking DNA, and the eukaryotic poly(A) addition signal A-A-T-A-A-A occurs 16 bp before the first nucleotide of poly(A). The derived amino acid sequence yields an amino acid composition and a molecular weight compatible with those found for the β-type phaseolin protein. Two regions that probably serve as carbohydrate-peptide linkage recognition sites have been identified. A region of highly hydrophobic amino acids at the NH2 terminus of the protein suggests the presence of a signal peptide in the newly synthesized phaseolin protein.
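
The segment lengths reported above can be combined into a few derived figures. The following is a minimal arithmetic sketch, not part of the original analysis. It assumes that the 1,263 bp of protein-encoding DNA excludes the intervening sequences, and that the untranslated regions run from the transcription start to the poly(A) addition site. All variable and function names are hypothetical. The abstract does not say whether the 1,263 bp includes the stop codon, so the codon count is only an upper bound on the number of amino acid residues.

```python
# Segment lengths in base pairs, taken from the abstract.
FIVE_PRIME_UTR = 80
CODING = 1263  # assumed to exclude the intervening sequences
INTRONS = {"IVS1": 72, "IVS2": 88, "IVS3": 124, "IVS4": 128, "IVS5": 103}
THREE_PRIME_UTR = 135


def intron_total(introns=INTRONS):
    """Sum of all intervening-sequence lengths (bp)."""
    return sum(introns.values())


def transcribed_length():
    """Length of the transcribed region (bp), introns included."""
    return FIVE_PRIME_UTR + CODING + intron_total() + THREE_PRIME_UTR


def mature_mrna_length():
    """Length of the spliced mRNA (nt), excluding the poly(A) tail."""
    return FIVE_PRIME_UTR + CODING + THREE_PRIME_UTR


def codon_count():
    """Number of complete codons in the protein-encoding DNA."""
    if CODING % 3 != 0:
        raise ValueError("coding length is not a multiple of 3")
    return CODING // 3


if __name__ == "__main__":
    print("intron total (bp):", intron_total())
    print("transcribed region (bp):", transcribed_length())
    print("mature mRNA without poly(A) (nt):", mature_mrna_length())
    print("codons:", codon_count())
```

Under these assumptions the five intervening sequences total 515 bp, the transcribed region spans 1,993 bp, the spliced mRNA is 1,478 nucleotides before polyadenylation, and the coding region holds 421 codons.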