Determination of the Processing Sites of an Arabidopsis 2S Albumin and Characterization of the Complete Gene Family

Abstract
The most abundant isoform of the 2S albumin present in seeds of Arabidopsis thaliana has been sequenced and the corresponding gene isolated. Comparison of the protein and DNA sequences allows the exact proteolytic cleavage sites used during posttranslational processing to be determined. Like other 2S albumins, that of Arabidopsis is synthesized as a prepropeptide. After removal of the signal peptide, the propeptide is cleaved at four further sites, giving two subunits linked by one or more disulfide bridges. Comparison of these cleavage sites with those of the 2S albumins of Brassica napus and Bertholletia excelsa suggests that, although each individual cleavage site is conserved between species, the four processing sites within a species are not similar to one another, which implies that up to four different proteases may be involved in processing 2S albumins. The Arabidopsis 2S albumin gene was used to isolate the entire gene family. There are four genes, tightly linked in a tandem array. None of the genes contains an intron. Comparison of the predicted protein sequences shows that only one of the genes can encode the isoform determined by protein analysis to be the most abundant, and this gene must therefore be expressed. It is possible that some or all of the other three genes are also active.