Peptide‐mass fingerprinting and the ideal covering set for protein characterisation
- 1 January 1997
- journal article
- research article
- Published by Wiley in Electrophoresis
- Vol. 18 (8), 1399-1409
- https://doi.org/10.1002/elps.1150180815
Abstract
The rules that govern the dynamics of protein characterisation by peptide‐mass fingerprinting (PMF) were investigated through multiple interrogations of a nonredundant protein database. This was achieved by analysing the efficiency of identifying each entry in the entire database via perfect in silico digestion with a series of 20 pseudo‐endoproteinases cutting at the carboxy terminus of each amino acid residue, and the multiple cutters: trypsin, chymotrypsin and Glu‐C. The distribution of peptide fragment masses generated by endoproteinase digestion was examined with a view to designing better approaches to protein characterisation by PMF. On average, and for both common and rare cutters, the combination of approximately two fragments was sufficient to identify most database entries. However, the rare cutters left more entries unidentified in the database. Total coverage of the entire database could not be achieved with one enzymatic cutter alone, nor when all 23 cutters were used together. Peptide fragments of > 5000 Da had little effect on the ability of PMF to correctly characterise database entries, while those with low mass (near to 350 Da in the case of trypsin) were found to be of most utility. The most frequently occurring fragments were also found in this lower mass region. The maximum size of uncut database entries (those not containing a specific amino acid residue) ranged from 52 908 Da to 258 314 Da, while the failure rate for a single cutter in identifying database entries varied from 10 865 (8.4%) to 23 290 (18.1%). PMF is likely to be a mainstay of any high‐throughput protein screening strategy for large‐scale proteome analysis. A better understanding of the merits and limitations of this technique will allow researchers to optimise their protein characterisation procedures.
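The perfect in silico digestion and fragment matching described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the toy sequences, and the 0.5 Da tolerance are all hypothetical, the residue masses are standard monoisotopic values rather than figures from the article, and the trypsin rule is simplified to "cut after K or R" with no proline exception and no missed cleavages.

```python
# Monoisotopic residue masses in Da (standard values, not taken from the article).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def digest(sequence, cut_after):
    """Perfect digestion: cleave after every residue in cut_after."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in cut_after:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Trailing fragment; an entry with no cut site comes back whole.
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER


def matching_entries(database, observed_masses, cut_after, tolerance=0.5):
    """Names of entries whose digest explains every observed mass."""
    hits = []
    for name, sequence in database.items():
        masses = [peptide_mass(p) for p in digest(sequence, cut_after)]
        if all(any(abs(m - o) <= tolerance for m in masses)
               for o in observed_masses):
            hits.append(name)
    return hits


# Hypothetical three-entry database, for illustration only.
TOY_DB = {"toy_A": "GAKSLR", "toy_B": "GAKTLR", "toy_C": "GGGG"}
TRYPSIN = "KR"  # simplified rule: assumption, see the note above

one_mass = matching_entries(TOY_DB, [274.16], TRYPSIN)
two_masses = matching_entries(TOY_DB, [274.16, 374.23], TRYPSIN)
```

In this toy case a single fragment mass is shared by two entries, and adding a second mass leaves one candidate, which mirrors the abstract's observation that roughly two fragments are enough to identify most entries. The entry `toy_C` contains no K or R, so it stays uncut, like the uncut database entries the abstract mentions.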