Characterization of the complementary deoxyribonucleic acid and gene coding for human prothrombin

Abstract
The DNA sequences of a complementary DNA (cDNA) and a portion of the gene coding for human prothrombin were determined. The cDNA was 2005 base pairs in length and was found to code for part of a leader sequence of 36 amino acids, 579 amino acids present in the mature protein, a stop codon, a noncoding region of 97 base pairs, and a poly(A) tail of 27 base pairs. It is proposed that the leader sequence consists of a signal sequence and a pro sequence for the mature protein that circulates in plasma. The 10 glutamic acid residues that are present in the amino-terminal region of prothrombin and are converted to γ-carboxyglutamic acid in the mature protein are coded only by the GAG codon. The cDNA for prothrombin was also employed as a probe for screening a human fetal liver genomic DNA library. One of the strongly positive phage, containing a human DNA insert of 5 kilobases, was mapped with restriction endonucleases and sequenced. This DNA contained approximately half of the gene for human prothrombin and included 6 introns and 5 exons coding for amino acid residues 144-448. The 2 largest intervening sequences in the genomic DNA each contained 2 copies of Alu repetitive DNA.
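
As a rough check on the cDNA composition described above, the segment lengths can be tallied in base pairs. This is only an illustrative sketch: the variable names are hypothetical, and the conversion assumes 3 base pairs per amino acid codon. The abstract does not itemize what accounts for any remainder between the listed segments and the reported 2005 base pairs, so the sketch reports that difference without interpreting it.

```python
# Tally the cDNA segments listed in the abstract (all names are illustrative).
CODON_BP = 3  # assumption: each amino acid or stop codon spans 3 base pairs

segments_bp = {
    "partial leader sequence (36 aa)": 36 * CODON_BP,
    "mature protein (579 aa)": 579 * CODON_BP,
    "stop codon": CODON_BP,
    "noncoding region": 97,
    "poly(A) tail": 27,
}

reported_total_bp = 2005  # cDNA length stated in the abstract

itemized_bp = sum(segments_bp.values())
unitemized_bp = reported_total_bp - itemized_bp  # not broken down in the abstract

for name, length in segments_bp.items():
    print(f"{name}: {length} bp")
print(f"itemized total: {itemized_bp} bp")
print(f"reported total: {reported_total_bp} bp")
print(f"not itemized in the abstract: {unitemized_bp} bp")
```

The itemized segments cover most of the reported length; the small remainder is simply not described at the level of detail given in the abstract.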