Statistical Analysis of Pathogenicity of Somatic Mutations in Cancer
- 1 August 2006
- journal article
- Published by Oxford University Press (OUP) in Genetics
- Vol. 173 (4), 2187-2198
- https://doi.org/10.1534/genetics.105.044677
Abstract
Recent large-scale sequencing studies have revealed that cancer genomes contain variable numbers of somatic point mutations distributed across many genes. These somatic mutations most likely include passenger mutations that are not cancer causing and pathogenic driver mutations in cancer genes. Establishing a significant presence of driver mutations in such data sets is of biological interest. Whereas current techniques from phylogeny are applicable to large data sets composed of singly mutated samples, recently exemplified with a p53 mutation database, methods for smaller data sets containing individual samples with multiple mutations need to be developed. By constructing distinct models of both the mutation process and selection pressure upon the cancer samples, exact statistical tests to examine this problem are devised. Tests to examine the significance of selection toward missense, nonsense, and splice site mutations are derived, along with tests assessing variation in selection between functional domains. Maximum-likelihood methods facilitate parameter estimation, including levels of selection pressure and minimum numbers of pathogenic mutations. These methods are illustrated with 25 breast cancers screened across the coding sequences of 518 kinase genes, revealing 90 base substitutions in 71 genes. Significant selection pressure upon truncating mutations was established. Furthermore, an estimated minimum of 29.8 mutations were pathogenic.
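The general idea of an exact test for selection can be illustrated with a deliberately simplified sketch. It is not the model developed in the article: it assumes only two mutation classes (truncating versus everything else), a single known null probability that a passenger mutation is truncating, and independence between mutations. The counts, the null probability `p_null`, and the function names are all hypothetical values chosen for illustration.

```python
from math import comb


def binomial_tail(k, n, p):
    """Exact one-sided p-value: P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def selection_mle(k, n, p0):
    """Maximum-likelihood estimate of a selection ratio s in a two-class model.

    Assumed model (a simplification, not the article's): an observed mutation
    is truncating with probability s*p0 / (s*p0 + 1 - p0), where p0 is the
    probability under no selection. Setting this equal to k/n and solving
    for s gives the odds-ratio form below.
    """
    if k == n:
        return float("inf")  # every mutation is truncating; s is unbounded
    return k * (1 - p0) / ((n - k) * p0)


# Hypothetical illustration values, not data from the article.
n_mutations = 20    # total observed somatic substitutions
n_truncating = 4    # of which truncating (nonsense or splice site)
p_null = 0.05       # assumed chance a passenger mutation is truncating

p_value = binomial_tail(n_truncating, n_mutations, p_null)
s_hat = selection_mle(n_truncating, n_mutations, p_null)

print(f"one-sided exact p-value: {p_value:.4f}")
print(f"estimated selection ratio: {s_hat:.2f}")
```

A small p-value here would indicate more truncating mutations than the passenger-only null predicts, and an estimated ratio above 1 would point toward positive selection for that class. A realistic analysis would need the null probability to reflect the mutation spectrum and the sequence composition of the screened genes, which this sketch takes as given.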