Large-Scale Structure-Based Prediction of Stable Peptide Binding to Class I HLAs Using Random Forests
Open Access
- 22 July 2020
- journal article
- research article
- Published by Frontiers Media SA in Frontiers in Immunology
- Vol. 11, 1583
- https://doi.org/10.3389/fimmu.2020.01583
Abstract
Prediction of stable peptide binding to Class I HLAs is an important component of designing immunotherapies. While the best-performing predictors are based on machine learning algorithms trained on peptide-HLA (pHLA) sequences, the use of structure for training predictors deserves further exploration. Given enough pHLA structures, a predictor based on the residue-residue interactions found in these structures has the potential to generalize to alleles with little or no experimental data. We have previously developed APE-Gen, a modeling approach able to produce pHLA structures in a scalable manner. In this work we use APE-Gen to model over 150,000 pHLA structures, the largest dataset of its kind, which we use to train a structure-based pan-allele model. We extract simple, homogeneous features based on residue-residue distances between peptide and HLA, and build a random forest model for predicting stable pHLA binding. Our model achieves competitive AUROC values on leave-one-allele-out validation tests using significantly less data than popular sequence-based methods. Additionally, our model offers an interpretation analysis that can reveal how the model combines the features to arrive at any given prediction. This interpretation analysis can be used to check whether the model is in line with chemical intuition, and we showcase particular examples. Our work is a significant step toward using structure to achieve generalizable and more interpretable prediction of stable pHLA binding.
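As a rough illustration of the two ideas named in the abstract (features built from peptide-HLA residue-residue distances, and leave-one-allele-out validation), the sketch below may help. It is not the authors' implementation: the input format (a residue as a one-letter code plus a coordinate triple), the function names `contact_features` and `leave_one_allele_out`, the 8.0 distance cutoff, and the `1 / (1 + d)` weighting are all assumptions chosen for brevity. The random forest itself is omitted; in practice a standard library implementation would be trained on each fold.

```python
import math
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# One feature slot per ordered (peptide residue type, HLA residue type) pair.
PAIR_INDEX = {pair: i for i, pair in enumerate(product(AMINO_ACIDS, repeat=2))}


def contact_features(peptide, hla, cutoff=8.0):
    """Return a fixed-length vector of distance-weighted contact sums.

    `peptide` and `hla` are lists of (one_letter_code, (x, y, z)) tuples.
    This format, the cutoff, and the weighting are illustrative assumptions.
    """
    features = [0.0] * len(PAIR_INDEX)
    for p_aa, p_xyz in peptide:
        for h_aa, h_xyz in hla:
            slot = PAIR_INDEX.get((p_aa, h_aa))
            if slot is None:
                continue  # skip non-standard residue codes
            d = math.dist(p_xyz, h_xyz)
            if d <= cutoff:
                # Closer residue pairs contribute more (assumed weighting).
                features[slot] += 1.0 / (1.0 + d)
    return features


def leave_one_allele_out(samples):
    """Yield (held_out_allele, train, test) for each allele in `samples`.

    Each sample is a dict with at least an "allele" key (assumed layout).
    """
    for allele in sorted({s["allele"] for s in samples}):
        train = [s for s in samples if s["allele"] != allele]
        test = [s for s in samples if s["allele"] == allele]
        yield allele, train, test
```

Because every structure maps to a vector of the same length regardless of allele or peptide length, a single pan-allele classifier can be fit on the training fold and scored on the held-out allele, which is the setting the leave-one-allele-out test is meant to probe.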
Funding Information
- U.S. National Library of Medicine
- Cancer Prevention and Research Institute of Texas
- National Science Foundation (CHE-1740990, CHE-1900374, PHY-1427654)
- Welch Foundation