Statistical analysis and prediction of protein–protein interfaces
- 19 May 2005
- journal article
- research article
- Published by Wiley in Proteins: Structure, Function, and Bioinformatics
- Vol. 60 (3), 353-366
- https://doi.org/10.1002/prot.20433
Abstract
Predicting protein–protein interfaces from a three‐dimensional structure is a key task of computational structural proteomics. In contrast to geometrically distinct small‐molecule binding sites, protein–protein interfaces are notoriously difficult to predict. We generated a large nonredundant data set of 1494 true protein–protein interfaces, using biological symmetry annotation where necessary. The data set was carefully analyzed, and a Support Vector Machine was trained on a combination of a new robust evolutionary conservation signal and local surface properties to predict protein–protein interfaces. Fivefold cross‐validation verifies the high sensitivity and selectivity of the model. As many as 97% of the predicted patches overlapped the true interface patch, while only 22% of the surface residues were included in an average predicted patch. The model allowed the identification of potential new interfaces and the correction of mislabeled oligomeric states. Proteins 2005.
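The two figures quoted in the abstract (the share of predicted patches that overlap the true interface, and the share of surface residues in an average predicted patch) can be illustrated with a small sketch. This is not the authors' code: the function names, the residue identifiers, the toy data, and the reading of "overlap" as "at least one shared residue" are all assumptions made for illustration only.

```python
from typing import Iterable, Set, Tuple

# Hypothetical type: one evaluation case is
# (predicted patch, true interface patch, all surface residues).
Case = Tuple[Set[str], Set[str], Set[str]]


def has_overlap(predicted: Set[str], true_interface: Set[str]) -> bool:
    """Assumed definition: the patches share at least one residue."""
    return len(predicted & true_interface) > 0


def surface_fraction(predicted: Set[str], surface: Set[str]) -> float:
    """Fraction of the surface residues that fall inside the predicted patch."""
    if not surface:
        raise ValueError("surface must contain at least one residue")
    return len(predicted & surface) / len(surface)


def summarize(cases: Iterable[Case]) -> Tuple[float, float]:
    """Return (share of predictions overlapping the true interface,
    mean fraction of the surface covered by a predicted patch)."""
    cases = list(cases)
    if not cases:
        raise ValueError("need at least one case")
    hits = sum(has_overlap(p, t) for p, t, _ in cases)
    mean_cov = sum(surface_fraction(p, s) for p, _, s in cases) / len(cases)
    return hits / len(cases), mean_cov


# Toy data (invented): a protein with 10 surface residues, two predictions.
surface = {f"A{i}" for i in range(1, 11)}
cases = [
    ({"A1", "A2"}, {"A2", "A3"}, surface),   # shares A2 with the true interface
    ({"A5", "A6", "A7"}, {"A1"}, surface),   # misses the true interface
]
overlap_rate, mean_fraction = summarize(cases)
print(overlap_rate, mean_fraction)
```

A high overlap rate is only informative together with a small covered fraction, since a patch spanning the whole surface would overlap every interface trivially; that is presumably why the abstract reports both numbers side by side.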