Automated inference of molecular mechanisms of disease from amino acid substitutions
Open Access
- 3 September 2009
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 25 (21), 2744-2750
- https://doi.org/10.1093/bioinformatics/btp528
Abstract
Motivation: Advances in high-throughput genotyping and next-generation sequencing have generated a vast amount of human genetic variation data. Single nucleotide substitutions within protein coding regions are of particular importance owing to their potential to give rise to amino acid substitutions that affect protein structure and function, which may ultimately lead to a disease state. Over the last decade, a number of computational methods have been developed to predict whether such amino acid substitutions result in an altered phenotype. Although these methods are useful in practice, and accurate for their intended purpose, they are not well suited for providing probabilistic estimates of the underlying disease mechanism.
Results: We have developed a new computational model, MutPred, that is based upon protein sequence and models changes of structural features and functional sites between wild-type and mutant sequences. These changes, expressed as probabilities of gain or loss of structure and function, can provide insight into the specific molecular mechanism responsible for the disease state. MutPred also builds on the established SIFT method but offers improved classification accuracy with respect to human disease mutations. Given conservative thresholds on the predicted disruption of molecular function, we propose that MutPred can generate accurate and reliable hypotheses on the molecular basis of disease for ∼11% of known inherited disease-causing mutations. We also note that the proportion of changes of functionally relevant residues in the sets of cancer-associated somatic mutations is higher than for the inherited lesions in the Human Gene Mutation Database, which are instead predicted to be characterized by disruptions of protein structure.
Availability: http://mutdb.org/mutpred
Contact: predrag@indiana.edu; smooney@buckinstitute.org