Structure of the gene coding for the alpha polypeptide chain of the human complement component C4b-binding protein.

Abstract
The human gene coding for the 70-kD polypeptide of the complement regulatory component C4b-binding protein (C4BP alpha) spans more than 40 kb of DNA and is composed of twelve exons. Upon transcription in liver, or in Hep-G2 cells, this gene produces a single transcript of 2,262 nucleotides, excluding the poly(A) tail, with an unusually long 5' untranslated region (5' UTR) of 223 nucleotides. The C4BP alpha gene is organized as follows: the first exon codes for the first 198 nucleotides of the 5' UTR. It is separated by a large intron from the second exon, which includes the remainder of the 5' UTR and the coding region for the signal peptide. Each of the eight 60-amino acid repeats (short consensus repeats [SCRs]) that compose the C4BP alpha polypeptide chain is encoded by a single exon, except for the second SCR, which is split between two exons. At the 3' end of the C4BP alpha gene, the twelfth exon codes for the COOH-terminal 57 amino acids of the mature protein, which have no similarity to the SCRs, and the 245 nucleotides of the 3' UTR. Examination of the nucleotide sequence of the first exon revealed an interesting characteristic, strongly suggesting that this exon may specify a functional domain of the C4BP alpha transcript. It includes two in-phase ATG codons, in a different frame from that encoding the C4BP alpha polypeptide, followed by an in-frame termination codon, also within the first exon. Comparison between mouse and human C4BP alpha transcripts indicates conservation of this structure within the 5' UTR. C4BP is expressed in the liver and is an acute phase protein. A computer search of the genomic sequences upstream of the transcription start site demonstrates the presence of potential cis-acting regulatory elements similar to those found in the promoters of other liver-expressed and/or acute phase genes.
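The first-exon feature described above (ATG codons in a frame different from the main coding frame, closed by an in-frame termination codon) is the kind of pattern a simple upstream open reading frame scan would flag. The following is a minimal sketch only, not the authors' method: the function name `find_upstream_orfs`, its `region_end` parameter, and the toy sequence are hypothetical, and the sequence shown is not the C4BP alpha 5' UTR. It assumes the main start codon immediately follows the supplied UTR string.

```python
# Standard termination codons (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_upstream_orfs(utr, region_end=None):
    """Scan a 5' UTR for ATG codons followed by an in-frame stop codon.

    Assumption: the main start codon begins right after `utr`, so the
    main reading frame starts at index len(utr).

    `region_end` (hypothetical parameter) limits the scan to the first
    `region_end` nucleotides, e.g. the part of the UTR in one exon.

    Returns a list of (atg_index, stop_index, frame_offset) tuples, where
    frame_offset is (len(utr) - atg_index) % 3; a non-zero value means the
    upstream ATG is out of frame with the main coding sequence.
    """
    utr = utr.upper()
    main_start = len(utr)
    if region_end is None:
        region_end = len(utr)

    hits = []
    for i in range(region_end - 2):
        if utr[i:i + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this ATG.
        for j in range(i + 3, region_end - 2, 3):
            if utr[j:j + 3] in STOP_CODONS:
                hits.append((i, j, (main_start - i) % 3))
                break
    return hits


# Toy sequence (made up for illustration): two ATGs in phase with each
# other, sharing one downstream stop codon.
toy_utr = "CCATGAAAATGCCCTAAGG"
orfs = find_upstream_orfs(toy_utr)
for atg, stop, offset in orfs:
    print(f"ATG at {atg}, stop at {stop}, frame offset vs main ORF: {offset}")
```

In the toy sequence both ATG codons share the same stop codon and the same non-zero frame offset, which mirrors the arrangement the abstract reports: two in-phase ATGs, out of frame with the main coding sequence, terminated within the same region. To restrict the scan to one exon of a real transcript, `region_end` would be set to that exon's length (198 for the first exon described here).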
