Selective cloning and sequence analysis of the human L1 (LINE-1) sequences which transposed in the relatively recent past

Abstract
L1 (LINE-1), a long interspersed repetitive DNA family of mammalian genomes, is thought to be derived from one or more retrotransposon-like elements, but its actively transposable unit(s) has not yet been identified. We developed a novel method for selectively isolating human L1 sequences that transposed in the relatively recent past and may still retain features of the "active L1" unit. From inspection of the nucleotide sequences, we conjectured that the "active L1" or "nearly active L1" units should have a high content of the CpG dinucleotide, a mutation hot spot, and should contain several sites for rare cutters such as BssH II and Nar I in their 5' terminal regions. Using these rare cutter sites as selection markers, we isolated L1 sequences that had a high CpG content in the 5' terminal region and over 90% homology to L1 transcripts found in a human teratocarcinoma cell line. These L1s were shown to be "relatively new L1" units that had integrated into chromosomes within the last several million years of evolution. From the sequence data of these L1s and the L1 cDNA, a consensus sequence of the 5' terminal region of high-CpG L1s was constructed. A region of the consensus sequence showed about 69% homology to the 5' terminal region of the Drosophila jockey element.
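The selection criteria described above (high CpG content and the presence of rare cutter sites in the 5' terminal region) can be illustrated computationally. The following is only a minimal sketch, not the method used in this work, which relied on restriction digestion and cloning. The function names and the example sequence are hypothetical, and the observed/expected CpG ratio is a commonly used measure introduced here as an assumption. The recognition sequences are the standard ones for BssH II (GCGCGC) and Nar I (GGCGCC).

```python
# Illustrative sketch only: an in silico analogue of the selection criteria.
# All names below are hypothetical; the paper's selection was done at the bench.

# Standard recognition sequences of the two rare cutters named in the abstract.
RARE_CUTTER_SITES = {"BssHII": "GCGCGC", "NarI": "GGCGCC"}


def cpg_observed_expected(seq: str) -> float:
    """Return the observed/expected CpG ratio: (#CG * length) / (#C * #G).

    This ratio is an assumed measure of CpG richness, not one taken
    from the paper.
    """
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    # "CG" cannot overlap itself, so str.count gives the exact number.
    cpg_count = seq.count("CG")
    return cpg_count * len(seq) / (c_count * g_count)


def find_rare_cutter_sites(seq: str) -> dict[str, list[int]]:
    """Return 0-based start positions of each recognition site in seq."""
    seq = seq.upper()
    hits: dict[str, list[int]] = {}
    for enzyme, site in RARE_CUTTER_SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            # Advance by one base so overlapping occurrences are reported.
            start = seq.find(site, start + 1)
        hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    # Made-up sequence for demonstration, not real L1 data.
    example = "AAGCGCGCTTGGCGCCAA"
    print(cpg_observed_expected(example))
    print(find_rare_cutter_sites(example))
```

Both recognition sequences contain CpG, which is why such sites are expected to be lost over time in older, mutated L1 copies and retained in recently transposed ones.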