Optimized filtering reduces the error rate in detecting genomic variants by short-read sequencing
Open Access
- 18 December 2011
- journal article
- research article
- Published by Springer Science and Business Media LLC in Nature Biotechnology
- Vol. 30 (1), 61-68
- https://doi.org/10.1038/nbt.2053
Abstract
Distinguishing single-nucleotide variants (SNVs) from errors in whole-genome sequences remains challenging. Here we describe a set of filters, together with a freely accessible software tool, that selectively reduce error rates and thereby facilitate variant detection in data from two short-read sequencing technologies, Complete Genomics and Illumina. By sequencing the nearly identical genomes from monozygotic twins and considering shared SNVs as 'true variants' and discordant SNVs as 'errors', we optimized thresholds for 12 individual filters and assessed which of the 1,048 filter combinations were effective in terms of sensitivity and specificity. Cumulative application of all effective filters reduced the error rate by 290-fold, facilitating the identification of genetic differences between monozygotic twins. We also applied an adapted, less stringent set of filters to reliably identify somatic mutations in a highly rearranged tumor and to identify variants in the NA19240 HapMap genome relative to a reference set of SNVs.
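The twin-based evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' software tool. The metric names (`depth`, `qual`), the threshold values, the toy calls, and all function names are hypothetical, and the rule that a shared call must pass in both twins is a simplifying assumption. The sketch labels shared calls as true variants and discordant calls as errors, then reports sensitivity and the fold reduction in errors for each combination of filters.

```python
from itertools import combinations

# Hypothetical toy calls: (chrom, pos, alt) -> per-call metrics.
# The metric names and values are illustrative, not the paper's 12 filters.
calls_a = {
    ("chr1", 100, "A"): {"depth": 30, "qual": 50},
    ("chr1", 200, "T"): {"depth": 25, "qual": 45},
    ("chr1", 300, "G"): {"depth": 5, "qual": 12},   # called in twin A only
    ("chr2", 400, "C"): {"depth": 8, "qual": 40},   # called in twin A only
}
calls_b = {
    ("chr1", 100, "A"): {"depth": 28, "qual": 48},
    ("chr1", 200, "T"): {"depth": 9, "qual": 44},
    ("chr2", 500, "G"): {"depth": 40, "qual": 60},  # called in twin B only
}


def split_calls(a, b):
    """Shared calls stand in for true variants, discordant calls for errors."""
    keys_a, keys_b = set(a), set(b)
    return keys_a & keys_b, keys_a ^ keys_b


def passes(metrics, thresholds):
    """A call passes if every filtered metric meets its minimum threshold."""
    return all(metrics[name] >= minimum for name, minimum in thresholds.items())


def evaluate(a, b, thresholds):
    """Return (sensitivity, fold reduction in errors) for one filter set."""
    shared, discordant = split_calls(a, b)
    # Assumption: a shared call is retained only if it passes in both twins.
    kept_true = sum(
        1 for k in shared if passes(a[k], thresholds) and passes(b[k], thresholds)
    )
    # A discordant call is judged on the one twin in which it was called.
    kept_err = sum(
        1 for k in discordant if passes(a[k] if k in a else b[k], thresholds)
    )
    sensitivity = kept_true / len(shared) if shared else 0.0
    fold = len(discordant) / kept_err if kept_err else float("inf")
    return sensitivity, fold


def evaluate_combinations(a, b, thresholds):
    """Evaluate every non-empty combination of the given filters."""
    names = sorted(thresholds)
    results = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            subset = {name: thresholds[name] for name in combo}
            results[combo] = evaluate(a, b, subset)
    return results


thresholds = {"depth": 10, "qual": 20}
results = evaluate_combinations(calls_a, calls_b, thresholds)
for combo, (sensitivity, fold) in results.items():
    print(combo, f"sensitivity={sensitivity:.2f}", f"error reduction={fold:.1f}x")
```

In this toy data the quality filter alone keeps every shared call but removes only one of three discordant calls, while the depth filter removes more errors at the cost of sensitivity. That trade-off is the kind the abstract refers to when it assesses filter combinations by sensitivity and specificity.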