The functional importance of disease-associated mutation

Abstract
For many years, scientists believed that point mutations in genes are the genetic switches for somatic and inherited diseases such as cystic fibrosis, phenylketonuria and cancer. Some of these mutations likely alter a protein's function in a deleterious manner, and they should therefore occur in functionally important regions of the protein products of genes. Here we show that disease-associated mutations occur in conserved regions of genes, and that conservation can be used to identify likely disease-causing mutations. To show this, we determined conservation patterns for 6185 non-synonymous and heritable disease-associated mutations in 231 genes. We define a parameter, the conservation ratio, as the ratio of the average negative entropy of analyzable positions with reported mutations to that of every analyzable position in the gene sequence. We found that 84.0% of the 231 genes had conservation ratios less than one. Of the 139 genes with eleven or more analyzable mutations, 88.0% had conservation ratios less than one. These results indicate that phylogenetic information is a powerful tool for the study of disease-associated mutations. Our alignments and analysis have been made available as part of the database at http://cancer.stanford.edu/mut-paper/. Within this dataset, each position is annotated with the analysis, so that the most likely disease-causing mutations can be identified.
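The conservation ratio defined above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of the natural logarithm, the toy alignment, and the assumption that every supplied column is already "analyzable" are all illustrative choices that the abstract does not specify.

```python
import math
from collections import Counter


def negative_entropy(column):
    """Return sum(p * ln p) over the residues in one alignment column.

    The value is always <= 0 and equals 0 for a perfectly conserved column.
    The natural log is an assumption; the abstract does not state the base.
    """
    counts = Counter(column)
    total = sum(counts.values())
    return sum((n / total) * math.log(n / total) for n in counts.values())


def conservation_ratio(columns, mutated_positions):
    """Mean negative entropy at mutated positions divided by the mean over all positions.

    `columns` holds one string of aligned residues per analyzable position
    (hypothetical input format); `mutated_positions` indexes into `columns`.
    """
    if not columns or not mutated_positions:
        raise ValueError("need at least one column and one mutated position")
    scores = [negative_entropy(col) for col in columns]
    mean_all = sum(scores) / len(scores)
    if mean_all == 0:
        raise ValueError("ratio undefined: every position is perfectly conserved")
    mean_mutated = sum(scores[i] for i in mutated_positions) / len(mutated_positions)
    return mean_mutated / mean_all


# Toy alignment: four positions, four sequences each (made-up data).
toy_columns = ["AAAA", "AAAT", "ACGT", "AACC"]
# Suppose reported mutations fall on the two most conserved positions.
ratio = conservation_ratio(toy_columns, mutated_positions=[0, 1])
print(f"conservation ratio: {ratio:.3f}")  # below one for this toy input
```

Because negative entropy is never positive, a ratio below one means the mutated positions have, on average, negative entropy closer to zero than the gene as a whole, that is, they are more conserved.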