Systematic assessment of copy number variant detection via genome-wide SNP genotyping

Abstract
Evan Eichler and colleagues present an analysis of how accurately current commercial SNP platforms capture copy number variants (CNVs). Using Illumina Human 1M genotype data, they were able to accurately predict many of the sites identified in their recent study, which assessed CNVs in nine human individuals with a fosmid paired-end sequence approach. However, they find that commonly used platforms offer limited coverage for a large fraction of CNVs.

SNP genotyping has emerged as a technology to incorporate copy number variants (CNVs) into genetic analyses of human traits. However, the extent to which SNP platforms accurately capture CNVs remains unclear. Using independent, sequence-based CNV maps, we find that commonly used SNP platforms have limited or no probe coverage for a large fraction of CNVs. Despite this, in 9 samples we inferred 368 CNVs using Illumina SNP genotyping data and experimentally validated over two-thirds of these. We also developed a method (SNP-Conditional Mixture Modeling, SCIMM) to robustly genotype deletions using as few as two SNP probes. We find that HapMap SNPs are strongly correlated with 82% of common deletions, but that the newest SNP platforms effectively tag only about 50%. We conclude that currently available genome-wide SNP assays can capture CNVs accurately, but that improvements in array designs, particularly in duplicated sequences, are necessary to facilitate more comprehensive analyses of genomic variation.
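To make the notion of a SNP being "strongly correlated" with a deletion concrete, the following minimal sketch computes a squared Pearson correlation (r²) between per-sample SNP allele dosage and deletion copy number. The abstract does not state which correlation measure or threshold was used, so the choice of r², the function name `r_squared`, and the six-sample data are all illustrative assumptions and not the authors' method.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A site with no variation across samples carries no tagging information.
        return 0.0
    return (sxy * sxy) / (sxx * syy)


# Hypothetical data for six samples (not taken from the study):
# SNP dosage = number of copies of one SNP allele (0, 1, or 2),
# deletion_copies = number of intact copies of the region (0, 1, or 2).
snp_dosage = [0, 1, 2, 0, 1, 2]
deletion_copies = [2, 1, 0, 2, 1, 0]

perfect = r_squared(snp_dosage, deletion_copies)
print(f"r^2 = {perfect:.2f}")
```

In this toy case the SNP allele tracks the deletion exactly, so the SNP would serve as a perfect proxy for it. In practice one would compare r² against a chosen cutoff to decide whether a deletion counts as tagged; that cutoff is a design decision not specified here.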