Cloning of cDNAs coding for the heavy chain region and connecting region of human factor V, a blood coagulation factor with four types of internal repeats

Abstract
Human factor V is a high molecular weight plasma glycoprotein that participates as a cofactor in the conversion of prothrombin to thrombin by factor Xa. Prior to its participation in the coagulation cascade, factor V is converted to factor Va by thrombin, generating a heavy chain and a light chain that are held together by calcium ions. A connecting region originally located between the heavy and light chains is liberated during the activation reaction. In a previous study, a cDNA of 2970 nucleotides that codes for the carboxyl-terminal 938 amino acids of factor V was isolated and characterized from a Hep G2 cDNA library [Kane, W. H. and Davie, E. W. (1986) Proc. Natl. Acad. Sci. U.S.A. 83, 6800-6804]. This cDNA has been used to obtain additional clones from Hep G2 and human liver cDNA libraries. Furthermore, a Hep G2 cDNA library prepared with an oligonucleotide from the 5' end of these cDNAs was screened to obtain overlapping cDNA clones that code for the amino-terminal region of the molecule. The composite sequence of these clones spans 6911 nucleotides and is consistent with the size of the factor V message present in Hep G2 cells (approximately 7 kilobases). The cDNA codes for a leader sequence of 28 amino acids and a mature protein of 2196 amino acids. The amino acid sequence predicted from the cDNA was in complete agreement with 139 amino acid residues that were identified by Edman degradation of cyanogen bromide peptides isolated from the heavy chain region and connecting region of plasma factor V. The domain structure of human factor V is similar to that previously reported for human coagulation factor VIII. Both proteins contain an amino-terminal heavy chain region, a connecting region, and a carboxyl-terminal light chain region. Also, each protein contains two A domains in the heavy chain region and one A domain in the light chain region. These A domains are approximately 30% identical with the triplicated A domains of ceruloplasmin, a copper-binding protein present in plasma. Factor V and factor VIII also contain a large connecting region rich in carbohydrate, but there is no significant amino acid sequence homology between the two proteins in this portion of the molecules. Two types of tandem repeats (17 and 9 amino acids) have also been identified in the connecting region of factor V. The light chain regions of factor V and factor VIII also contain two C domains that follow the third A domain. The present data indicate that the amino acid sequence in the heavy and light chain regions of factor V is approximately 40% identical with the corresponding regions of factor VIII. These experiments provide additional support for the concept that these two proteins, as well as ceruloplasmin, share a common ancestry.