Cascaded multiple classifiers for secondary structure prediction

Abstract
We describe a new classifier for protein secondary structure prediction, formed by cascading different types of classifiers based on neural networks and linear discrimination. The new classifier achieves an accuracy of 76.7%, assessed by a rigorous full jackknife procedure, on a new nonredundant dataset of 496 nonhomologous sequences (obtained from G.J. Barton and J.A. Cuff). This dataset was designed specifically for training and testing protein secondary structure prediction methods, and it uses a more stringent definition of homologous sequences than previous studies. We show that it is possible to design classifiers that discriminate well among the three classes (H, E, C), with an accuracy of up to 78% for β‐strands, using only a local window and resampling techniques. This indicates that the importance of long‐range interactions for the prediction of β‐strands has probably been overestimated in the past.
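To make the idea of cascading over a local window concrete, the following is a minimal sketch and not the classifier described here. It assumes a two-stage arrangement in which a first stage scores each residue from a window of amino acids and a second stage rescores each residue from a window of first-stage outputs. The functions toy_stage1 and toy_stage2, the residue groups, and the window half-widths are hypothetical placeholders standing in for trained neural networks or linear discriminants.

```python
from typing import List, Sequence

CLASSES = "HEC"  # helix, strand, coil
PAD = "-"        # padding symbol for window positions beyond the chain ends

Scores = List[float]


def windows(seq: Sequence, half_width: int, pad) -> List[list]:
    """Return one window of length 2*half_width + 1 centred on each position."""
    padded = [pad] * half_width + list(seq) + [pad] * half_width
    return [padded[i:i + 2 * half_width + 1] for i in range(len(seq))]


def toy_stage1(window: list) -> Scores:
    """Hypothetical stand-in for a trained sequence-to-structure classifier.

    Scores are the fractions of window positions falling in each toy group.
    """
    helix_like = set("AELMQKR")   # toy residue group, not fitted values
    strand_like = set("VIYFWT")   # toy residue group, not fitted values
    h = sum(r in helix_like for r in window)
    e = sum(r in strand_like for r in window)
    c = len(window) - h - e       # everything else, including padding
    total = float(len(window))
    return [h / total, e / total, c / total]


def toy_stage2(window: list) -> Scores:
    """Hypothetical stand-in for a structure-to-structure classifier.

    Here it simply averages the first-stage scores across the window.
    """
    return [sum(s[k] for s in window) / len(window) for k in range(3)]


def cascade_predict(seq: str, half1: int = 6, half2: int = 2,
                    stage1=toy_stage1, stage2=toy_stage2) -> str:
    """Feed stage-1 scores, windowed again, into stage 2, then take the argmax."""
    first = [stage1(w) for w in windows(seq, half1, PAD)]
    neutral = [1 / 3, 1 / 3, 1 / 3]  # padding for the second-stage window
    second = [stage2(w) for w in windows(first, half2, neutral)]
    return "".join(CLASSES[max(range(3), key=lambda k: s[k])] for s in second)


if __name__ == "__main__":
    print(cascade_predict("MKVLAAGIVEFTRQ"))
```

Only the wiring is meant to carry over: each stage sees a fixed local window, and any trained classifier that returns three scores per residue could replace the toy stages.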