Analysis and prediction of affinity of TAP binding peptides using cascade SVM

Abstract
The generation of cytotoxic T lymphocyte (CTL) epitopes from an antigenic sequence involves a number of intracellular processes, including the production of peptide fragments by the proteasome and the transport of peptides to the endoplasmic reticulum by the transporter associated with antigen processing (TAP). In this study, 409 peptides that bind to the human TAP transporter with varying affinity were analyzed to explore the selectivity and specificity of the transporter. The abundance of each amino acid at positions P1 to P9 was examined in high‐, intermediate‐, and low‐affinity TAP binders. Rules for predicting TAP binding regions in an antigenic sequence were derived from this analysis. A quantitative matrix was generated on the basis of the contribution of each position and residue to binding affinity. Using this quantitative matrix, a correlation of r = 0.65 was obtained between experimentally determined and predicted binding affinity. Further, a support vector machine (SVM)‐based method was developed to model the TAP binding affinity of peptides. A correlation of r = 0.80 was obtained between predicted and experimentally measured values using the sequence‐based SVM. The reliability of prediction was further improved by a cascade SVM that uses features of amino acids along with the sequence. When the cascade SVM‐based method was evaluated through jackknife testing, a correlation of r = 0.88 was obtained between measured and predicted values. A Web service, TAPPred (http://www.imtech.res.in/raghava/tappred/ or http://bioinformatics.uams.edu/mirror/tappred/), has been developed based on this approach.
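To make the quantitative-matrix idea concrete, the following Python sketch scores a 9-residue peptide by summing one contribution per position (P1 to P9) and residue, scans a longer sequence with a sliding window, and computes a Pearson correlation between two lists of values. It is an illustration under stated assumptions, not the matrix or code used in this study: the matrix entries are invented placeholder numbers, and the names make_toy_matrix, score_peptide, scan_sequence, and pearson_r are hypothetical.

```python
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PEPTIDE_LEN = 9  # positions P1..P9


def make_toy_matrix():
    """Build a 9 x 20 matrix of per-position residue contributions.

    ASSUMPTION: every value here is a made-up placeholder for illustration;
    these are not the published TAPPred matrix values.
    """
    matrix = [{aa: 0.0 for aa in AMINO_ACIDS} for _ in range(PEPTIDE_LEN)]
    matrix[0]["R"] = 1.0   # placeholder contribution at P1
    matrix[1]["R"] = 0.5   # placeholder contribution at P2
    matrix[8]["Y"] = 1.5   # placeholder contribution at P9
    matrix[0]["D"] = -1.0  # placeholder negative contribution at P1
    return matrix


def score_peptide(peptide, matrix):
    """Sum the matrix contribution of each residue at its position."""
    if len(peptide) != PEPTIDE_LEN:
        raise ValueError("expected a 9-residue peptide")
    # Residues missing from the matrix contribute 0.0 in this sketch.
    return sum(matrix[i].get(aa, 0.0) for i, aa in enumerate(peptide))


def scan_sequence(sequence, matrix):
    """Score every overlapping 9-mer window of a longer sequence."""
    return [
        (start, sequence[start:start + PEPTIDE_LEN],
         score_peptide(sequence[start:start + PEPTIDE_LEN], matrix))
        for start in range(len(sequence) - PEPTIDE_LEN + 1)
    ]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    toy = make_toy_matrix()
    for start, window, score in scan_sequence("ARRAAAAAAYA", toy):
        print(start, window, score)
```

In practice, pearson_r would be applied to the predicted scores and the experimentally measured affinities of the same peptides, which is how a figure such as r = 0.65 is reported; the sketch does not attempt to reproduce that value.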