Decision tree–driven tandem mass spectrometry for shotgun proteomics

Abstract
The two major mechanisms for peptide fragmentation in mass spectrometry, collision-activated dissociation (CAD) and the newer electron transfer dissociation (ETD), differ in efficacy across peptide chemistries. A decision tree algorithm, which can be embedded into instruments with both CAD and ETD capabilities, selects the optimal fragmentation method to improve the chances of successful peptide identification.

Mass spectrometry has become a key technology for modern large-scale protein sequencing. Tandem mass spectrometry, the process of peptide ion dissociation followed by mass-to-charge ratio (m/z) analysis, is the critical component for peptide identification. Recent advances in mass spectrometry now permit two discrete and complementary types of peptide ion fragmentation, collision-activated dissociation (CAD) and electron transfer dissociation (ETD), on a single instrument. To exploit this complementarity and increase sequencing success rates, we designed and embedded a data-dependent decision tree algorithm (DT) that makes unsupervised, real-time decisions about which fragmentation method to use based on precursor charge and m/z. Applying the DT to large-scale proteome analyses of Saccharomyces cerevisiae and human embryonic stem cells, we identified 53,055 peptides in total, more than with CAD (38,293) or ETD (39,507) alone. The DT method also identified 7,422 phosphopeptides, compared with 2,801 (CAD) and 5,874 (ETD).
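As a rough illustration of how a decision based on precursor charge and m/z might be structured, the sketch below routes each precursor to CAD or ETD. It is a minimal sketch under stated assumptions, not the embedded instrument logic: the function name `choose_fragmentation`, the charge boundaries, and every m/z cutoff are hypothetical placeholders that do not come from the text above. The only premise carried over is that the choice depends on precursor charge and m/z; the direction of the rules (low charge toward CAD, high charge and low m/z toward ETD) is itself an assumption.

```python
# Hypothetical m/z cutoffs per precursor charge state.
# These numbers are illustrative placeholders, NOT published values.
ILLUSTRATIVE_MZ_CUTOFFS = {3: 600.0, 4: 800.0, 5: 1000.0}


def choose_fragmentation(charge, mz, cutoffs=None,
                         max_cad_only_charge=2, min_etd_only_charge=6):
    """Return "CAD" or "ETD" for one precursor ion.

    Assumed rule shape (hypothetical):
      - charge <= max_cad_only_charge  -> always CAD
      - charge >= min_etd_only_charge  -> always ETD
      - otherwise ETD if m/z is below the cutoff for that charge, else CAD
    """
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    if mz <= 0:
        raise ValueError("m/z must be positive")
    if cutoffs is None:
        cutoffs = ILLUSTRATIVE_MZ_CUTOFFS

    if charge <= max_cad_only_charge:
        return "CAD"
    if charge >= min_etd_only_charge:
        return "ETD"

    cutoff = cutoffs.get(charge)
    if cutoff is None:
        # No cutoff configured for this charge state: fall back to CAD.
        return "CAD"
    return "ETD" if mz < cutoff else "CAD"


if __name__ == "__main__":
    # Example precursors as (charge, m/z) pairs; values are made up.
    for z, mz in [(2, 500.0), (3, 500.0), (3, 800.0), (7, 1200.0)]:
        print(z, mz, choose_fragmentation(z, mz))
```

Keeping the cutoffs in a plain dictionary makes the tree easy to retune per instrument or sample type without touching the branching logic.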