Detection and Direct Genomic Sequencing of Multiple Rare Unknown Flanking DNA in Highly Complex Samples

Abstract
Identifying the sequences of retro- and lentiviral integration sites in peripheral blood leukocytes allows the clonal composition and fate of genetically modified hematopoietic progenitor and stem cells to be mapped in vitro and in vivo. Previously available methods have been limited to the analysis of mono- or oligoclonal integration sites present in high copy numbers. Here, we characterize multiple rare retroviral and lentiviral integration sites in highly complex DNA samples. The reliability of the method results from the removal of nontarget DNA by magnetic extension primer tag selection (EPTS) before solid-phase ligation-mediated PCR (LM-PCR). EPTS/LM-PCR allowed simultaneous direct genomic sequencing of multiple proviral LTR-flanking sequences of retro- and lentiviral vectors even when only 1 in 100 to 1000 cells contained the provirus. Primer walking "around" the integration locus demonstrated that EPTS/LM-PCR can be adapted to study unknown flanking DNA regions unrelated to proviruses. The technique is fast, inexpensive, and sensitive even with minimal sample material. It enables studies of retro- and lentiviral integration, viral vector tracking in gene therapy, insertional mutagenesis, transgene integration, and direct genomic sequencing that until now have been difficult or impossible to perform.