A statistical framework for protein quantitation in bottom-up MS-based proteomics
Open Access
- 17 June 2009
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 25 (16), 2028-2034
- https://doi.org/10.1093/bioinformatics/btp362
Abstract
Motivation: Quantitative mass spectrometry-based proteomics requires protein-level estimates and associated confidence measures. Challenges include the presence of low quality or incorrectly identified peptides and informative missingness. Furthermore, models are required for rolling peptide-level information up to the protein level.
Results: We present a statistical model that carefully accounts for informative missingness in peak intensities and allows unbiased, model-based, protein-level estimation and inference. The model is applicable to both label-based and label-free quantitation experiments. We also provide automated, model-based, algorithms for filtering of proteins and peptides as well as imputation of missing values. Two LC/MS datasets are used to illustrate the methods. In simulation studies, our methods are shown to achieve substantially more discoveries than standard alternatives.
Availability: The software has been made available in the open-source proteomics platform DAnTE (http://omics.pnl.gov/software/).
Contact: adabney@stat.tamu.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
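The informative missingness named in the abstract can be illustrated with a toy simulation. The sketch below is not the model from the paper. It assumes, purely for illustration, that log-intensities are normally distributed with a known standard deviation and that any value below a detection threshold goes unobserved. All function names and parameter values are hypothetical. Under these assumptions, averaging only the observed values overestimates the true mean, while a likelihood that also counts the missing values as "below threshold" recovers it.

```python
import math
import random
import statistics


def normal_logpdf(x, mu, sigma):
    """Log density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of N(mu, sigma^2) at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def simulate(n, mu, sigma, threshold, seed=0):
    """Draw n log-intensities; values below the threshold are never observed."""
    rng = random.Random(seed)
    values = [rng.gauss(mu, sigma) for _ in range(n)]
    observed = [v for v in values if v >= threshold]
    return observed, n - len(observed)


def censored_loglik(mu, sigma, observed, n_missing, threshold):
    """Observed values contribute densities; each missing value contributes
    the probability of falling below the threshold (assumed mechanism)."""
    ll = sum(normal_logpdf(x, mu, sigma) for x in observed)
    if n_missing:
        ll += n_missing * math.log(normal_cdf(threshold, mu, sigma))
    return ll


def fit_censored_mean(observed, n_missing, sigma, threshold, steps=600):
    """Grid search for the mean over threshold +/- 3 sigma (sigma assumed known)."""
    lo = threshold - 3.0 * sigma
    step = 6.0 * sigma / steps
    grid = [lo + i * step for i in range(steps + 1)]
    return max(
        grid,
        key=lambda m: censored_loglik(m, sigma, observed, n_missing, threshold),
    )


if __name__ == "__main__":
    true_mu, sigma, threshold = 20.0, 1.0, 19.5  # hypothetical values
    observed, n_missing = simulate(2000, true_mu, sigma, threshold)
    naive = statistics.mean(observed)
    corrected = fit_censored_mean(observed, n_missing, sigma, threshold)
    print(f"missing: {n_missing} of 2000")
    print(f"naive mean of observed values: {naive:.2f}")
    print(f"censored-likelihood estimate:  {corrected:.2f}")
```

The known standard deviation and the hard threshold are simplifications chosen to keep the sketch short; a realistic analysis would estimate both from the data and would work at the peptide and protein level.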