Developmental biochemistry of cottonseed embryogenesis and germination. XIX. Sequences and genomic organization of the α globulin (vicilin) genes of cottonseed

Abstract
The α globulin storage protein genes of cotton exist as gene tandems, each containing one gene from each of the two α globulin subfamilies, separated by a spacer region of about 2700 or 3400 base pairs. Three different tandems have been identified by restriction endonuclease mapping of genomic DNA. A cDNA that differs from the genes of these tandems in restriction map sites and/or in nucleotide sequence indicates that a fourth tandem probably exists in the cotton genome. Since the species of cotton used here (Gossypium hirsutum) is an amphidiploid, it is likely that each of its two constituent genomes contributes two of the tandems. Considerable divergence in nucleotide sequence (18%) and in derived amino acid sequence (28%) is found when the two genes of a sequenced tandem are compared. The sequence of the cDNA closely resembles one of the genes in the tandem, showing only 4% divergence in nucleotides and 4.2% divergence in amino acids. Thus, the two genes of each tandem represent a relatively ancient gene duplication that has given rise to the two α globulin subfamilies of cotton. Only one subfamily has a glycosylation site, and glycosylation of its derived proteins gives rise to the two molecular weight sets of α globulins seen on gel electrophoresis. Other basic features of these genes and their derived proteins are presented.