Identification of deleterious mutations within three human genomes
Open Access
- 14 July 2009
- journal article
- research article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 19 (9), 1553-1561
- https://doi.org/10.1101/gr.092619.109
Abstract
Each human carries a large number of deleterious mutations. Together, these mutations make a significant contribution to human disease. Identification of deleterious mutations within individual genome sequences could substantially impact an individual's health through personalized prevention and treatment of disease. Yet, distinguishing deleterious mutations from the massive number of nonfunctional variants that occur within a single genome is a considerable challenge. Using a comparative genomics data set of 32 vertebrate species, we show that a likelihood ratio test (LRT) can accurately identify a subset of deleterious mutations that disrupt highly conserved amino acids within protein-coding sequences, which are likely to be unconditionally deleterious. The LRT is also able to identify known human disease alleles and performs as well as two commonly used heuristic methods, SIFT and PolyPhen. Application of the LRT to three human genomes reveals 796–837 deleterious mutations per individual, ∼40% of which are estimated to be at <5% allele frequency. However, the overlap between predictions made by the LRT, SIFT, and PolyPhen is low; 76% of predictions are unique to one of the three methods, and only 5% of predictions are shared across all three methods. Our results indicate that only a small subset of deleterious mutations can be reliably identified, but that this subset provides the raw material for personalized medicine.
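The two quantitative ideas in the abstract, a likelihood ratio test and the overlap between prediction sets, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `lrt_p_value` and `overlap_fractions` are hypothetical, the test is assumed to compare two nested models that differ by one free parameter (so the statistic is referred to a chi-square distribution with 1 degree of freedom), and the overlap fractions are assumed to be taken over the union of all predictions.

```python
import math


def lrt_p_value(loglik_null, loglik_alt):
    """Generic likelihood ratio test for nested models (assumed 1 extra parameter).

    Returns the test statistic and its p-value under a chi-square
    distribution with 1 degree of freedom.
    """
    # The statistic is twice the log-likelihood gain of the alternative model.
    stat = max(2.0 * (loglik_alt - loglik_null), 0.0)
    # Survival function of chi-square(1 df): P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def overlap_fractions(*prediction_sets):
    """Fraction of all predictions made by exactly one method, and by every method.

    Both fractions use the union of the prediction sets as the denominator
    (an assumption; the abstract does not state the denominator).
    """
    union = set().union(*prediction_sets)
    if not union:
        return 0.0, 0.0
    # Count how many methods flag each variant.
    counts = {v: sum(v in s for s in prediction_sets) for v in union}
    unique = sum(1 for c in counts.values() if c == 1)
    shared = sum(1 for c in counts.values() if c == len(prediction_sets))
    return unique / len(union), shared / len(union)


if __name__ == "__main__":
    # Toy log-likelihoods and toy variant identifiers, not data from the paper.
    print(lrt_p_value(-12.0, -10.0))
    print(overlap_fractions({"v1", "v2", "v3"}, {"v3", "v4"}, {"v3", "v5"}))
```

With three methods, the first value returned by `overlap_fractions` corresponds to the "unique to one of the three methods" figure and the second to the "shared across all three methods" figure, under the stated assumption about the denominator.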