Complementary DNA and derived amino acid sequence of the α subunit of human complement protein C8: evidence for the existence of a separate α subunit messenger RNA

Abstract
The entire amino acid sequence of the α subunit (Mr 64000) of the eighth component of complement (C8) was determined by characterizing cDNA clones isolated from a human liver cDNA library. Two clones with overlapping inserts of net length 2.44 kilobases (kb) were isolated and found to contain the entire α coding region [1659 base pairs (bp)]. The 5′ end consists of an untranslated region and a leader sequence of 30 amino acids. This sequence contains an apparent initiation Met, signal peptide, and propeptide which ends with an arginine-rich sequence that is characteristic of proteolytic processing sites found in the pro form of protein precursors. The 3′ untranslated region contains two polyadenylation signals and a poly(A) sequence. RNA blot analysis of total cellular RNA from the human hepatoma cell line HepG2 revealed a message size of ≈2.5 kb. Features of the 5′ and 3′ sequences and the message size suggest that a separate mRNA codes for α and argue against the occurrence of a single-chain precursor form of the disulfide-linked α-γ subunit found in mature C8. Analysis of the derived amino acid sequence revealed several membrane surface-seeking domains and a possible transmembrane domain. These occur in a cysteine-free region of the subunit and may constitute the structural basis for α interaction with target membranes. Analysis of the carbohydrate composition indicates 1 or 2 asparagine-linked but no O-linked oligosaccharide chains, a result consistent with predictions from the amino acid sequence. The α subunit contains segments homologous to the negatively charged, cysteine-rich repeat sequence found in low-density lipoprotein receptor and to the cysteine-rich epidermal growth factor-type sequence found in a number of proteins. Most significantly, it exhibits a striking overall homology to human C9, with values of 24% on the basis of identity and 46% when conserved substitutions are allowed. As described in an accompanying report [Howard, O. M. Z., Rao, A. G., and Sodetz, J. M. (1987) Biochemistry (following paper in this issue)], this homology also extends to the β subunit of C8.