Nucleotide sequence of yeast gene CPA1 encoding the small subunit of arginine‐pathway carbamoyl‐phosphate synthetase

Abstract
A yeast DNA fragment carrying the gene CPA1 encoding the small subunit of the arginine-pathway carbamoyl-phosphate synthetase has been sequenced. Only one continuous coding sequence on this fragment was long enough to account for the presumed molecular mass of the CPA1 protein product. It codes for a polypeptide of 411 amino acids having a relative molecular mass, Mr, of 45358 and showing extensive homology with the product of carA, the homologous Escherichia coli gene. The CPA1 and carA products are glutamine amidotransferases which bind glutamine and transfer its amide group to the large subunits, where it is used for the synthesis of carbamoyl-phosphate. A comparison of the amino acid sequence of the CPA1 polypeptide with the glutamine amidotransferase domains of anthranilate and p-aminobenzoate synthetases from various sources has revealed the presence in each of these sequences of three highly conserved regions of 8, 11 and 6 amino acids respectively. The 11-residue oligopeptide contains a cysteine which is considered to be the active-site residue involved in the binding of glutamine. The distances (number of amino acid residues) which separate these homology regions are accurately conserved in these various enzymes. These observations provide support for the hypothesis that these synthetases have arisen by the combination of a common ancestral glutamine amidotransferase subunit with distinct ammonia-dependent synthetases. Little homology was detected with the amide transfer domain of glutamine phosphoribosyldiphosphate amidotransferase, which may be the result of a convergent evolutionary process. The flanking regions of gene CPA1 have been sequenced, 803 base pairs being determined on the 5′ side and 382 on the 3′ side.
Several features of the 5′-upstream region of CPA1 potentially related to the control of its expression have been noted, including the presence of two copies of the consensus sequence d(T-G-A-C-T-C) previously identified in several genes subject to the general control of amino acid biosynthesis.
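As an illustration of how copies of a short consensus sequence such as d(T-G-A-C-T-C) can be located in an upstream region, the following is a minimal sketch that scans a DNA string for the hexanucleotide on both strands. The function names `reverse_complement` and `find_motif` and the 20-base test string are assumptions made for this example only; the test string is invented and is not the CPA1 sequence, and the sketch is not the procedure used in this work.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_motif(seq: str, motif: str = "TGACTC") -> list[tuple[int, str]]:
    """Return (0-based start, strand) for every exact match of the motif.

    Strand "+" means the motif reads 5'->3' on the given strand;
    strand "-" means it reads 5'->3' on the complementary strand.
    """
    seq = seq.upper()
    motif = motif.upper()
    motif_rc = reverse_complement(motif)
    hits = []
    for start in range(len(seq) - len(motif) + 1):
        window = seq[start:start + len(motif)]
        if window == motif:
            hits.append((start, "+"))
        elif window == motif_rc:
            hits.append((start, "-"))
    return hits


# Invented toy sequence (NOT the CPA1 upstream region): one copy per strand.
toy_upstream = "AATGACTCGGCCGAGTCATT"
hits = find_motif(toy_upstream)
```

Positions are reported relative to the start of the string supplied; converting them to coordinates relative to the initiation codon would require the actual upstream sequence, which is not reproduced here.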