Genome assembly comparison identifies structural variants in the human genome

Abstract
Numerous types of DNA variation exist, ranging from SNPs to larger structural alterations such as copy number variants (CNVs) and inversions. Alignment of DNA sequence from different sources has been used to identify SNPs [1,2] and intermediate-sized variants (ISVs) [3]. However, only a small proportion of total heterogeneity is characterized, and little is known of the characteristics of most smaller-sized ([...] 1.5 million SNPs. Some differences were simple insertions and deletions, but in regions containing CNVs, segmental duplication and repetitive DNA, they were more complex. Our results uncover substantial undescribed variation in humans, highlighting the need for comprehensive annotation strategies to fully interpret genome scanning and personalized sequencing projects.