Predicting Binding Affinities of Protein Ligands from Three-Dimensional Models: Application to Peptide Binding to Class I Major Histocompatibility Proteins

Abstract
A simple and fast free energy scoring function (Fresno) has been developed to predict the binding free energy of peptides to class I major histocompatibility (MHC) proteins. It differs from existing scoring functions mainly in its explicit treatment of ligand desolvation and of unfavorable protein−ligand contacts. It may therefore be particularly useful for predicting binding affinities from three-dimensional models of protein−ligand complexes. The Fresno function was calibrated independently on two training sets: (a) five HLA-A*0201-peptide structures determined by X-ray crystallography, and (b) three-dimensional models of 37 H-2Kk-peptide structures obtained by knowledge-based homology modeling. For both training sets, a good cross-validated fit to experimental binding free energies was obtained, with predictive errors of 3−3.5 kJ/mol. As expected, lipophilic interactions were found to contribute the most to HLA-A*0201-peptide interactions, whereas H-bonding predominates in H-2Kk recognition. Both cross-validated models were then used to predict the binding affinity of a test set of 26 peptides to HLA-A*0204 (an HLA allele closely related to HLA-A*0201) and of a series of 16 peptides to H-2Kk. Predictions were more accurate for HLA-A2-binding peptides, as that training set had been built from experimentally determined structures: the average error in predicting the binding free energy of the test peptides was 3.1 kJ/mol. For the homology model-derived equation, the average error in predicting the binding free energy of peptides to Kk was significantly higher (5.4 kJ/mol) but still acceptable. The present scoring function can thus predict binding free energies from three-dimensional models with good accuracy, provided that the backbone coordinates of the MHC-bound peptide have first been determined with an accuracy of about 1−1.5 Å. Furthermore, it may be easily recalibrated for any protein−ligand complex.
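
The calibration and cross-validation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the functional form of the scoring function, so the sketch assumes a linear combination of per-complex interaction descriptors fitted by least squares, and it assumes a leave-one-out scheme for the cross-validation. The descriptor columns, the function names (`fit_weights`, `predict`, `loo_rmse`), and all numerical values are hypothetical and synthetic; none of them are data or results from this work.

```python
import numpy as np

def fit_weights(X, y):
    """Least-squares fit of y ~ w0 + X @ w. Returns [w0, w1, ..., wk]."""
    A = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Predicted binding free energy for each row of descriptors in X."""
    return coef[0] + X @ coef[1:]

def loo_rmse(X, y):
    """Leave-one-out cross-validated root-mean-square prediction error."""
    n = len(y)
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                  # hold out complex i
        coef = fit_weights(X[keep], y[keep])      # refit on the rest
        errors[i] = predict(coef, X[i:i + 1])[0] - y[i]
    return float(np.sqrt(np.mean(errors ** 2)))

# Synthetic stand-in for a training set: 37 complexes, 3 hypothetical
# descriptors (for example H-bond, lipophilic, and desolvation terms).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(37, 3))
assumed_coef = np.array([-5.0, -1.2, -0.8, 0.6])   # invented weights
y = predict(assumed_coef, X) + rng.normal(0.0, 1.0, size=37)  # kJ/mol, synthetic

coef = fit_weights(X, y)
cv_error = loo_rmse(X, y)
print("fitted weights:", np.round(coef, 2))
print("leave-one-out RMSE (kJ/mol):", round(cv_error, 2))
```

Recalibrating for another protein−ligand system would, under the same assumptions, amount to calling `fit_weights` on descriptors and measured free energies for that system and checking `loo_rmse` before using the weights for prediction.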