Error Correlation and Error Reduction in Ensemble Classifiers

Abstract
Using an ensemble of classifiers instead of a single classifier can improve generalization. The gains from combining, however, often depend more on what is presented to the combiner than on the combining method itself. In this paper, we focus on data selection and classifier training methods that 'prepare' classifiers for combining. We review a combining framework for classification problems that quantifies the need for reducing the correlation among individual classifiers. We then discuss several methods that make the classifiers in an ensemble more complementary. Experimental results illustrate the benefits and pitfalls of reducing the correlation among classifiers, especially when training data are in limited supply.
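As a rough illustration of why correlation matters when outputs are averaged, the following minimal sketch uses a standard variance identity rather than the specific framework reviewed in this paper. It assumes N classifiers whose errors are zero-mean, share a common variance `sigma2`, and have a common pairwise correlation `delta` between 0 and 1; all function and parameter names are hypothetical. Under those assumptions the variance of the averaged error is `sigma2 * (1 + delta * (N - 1)) / N`, so it falls by a factor of N when `delta = 0` and does not fall at all when `delta = 1`.

```python
import random
import statistics


def averaged_error_variance(sigma2, delta, n):
    """Closed-form variance of the mean of n errors.

    Assumes every error has variance sigma2 and every pair of errors
    has correlation delta (illustrative assumptions, not from the paper).
    """
    return sigma2 * (1.0 + delta * (n - 1)) / n


def simulate_averaged_variance(sigma2, delta, n, trials=20000, seed=0):
    """Monte Carlo check of the closed form, valid for 0 <= delta <= 1.

    Each classifier's error is a shared Gaussian component plus a private
    one, which gives variance sigma2 and pairwise correlation delta.
    """
    rng = random.Random(seed)
    shared_sd = (delta * sigma2) ** 0.5            # common to all classifiers
    private_sd = ((1.0 - delta) * sigma2) ** 0.5   # unique to each classifier
    means = []
    for _ in range(trials):
        shared = rng.gauss(0.0, shared_sd)
        errors = [shared + rng.gauss(0.0, private_sd) for _ in range(n)]
        means.append(sum(errors) / n)
    return statistics.pvariance(means)


if __name__ == "__main__":
    # Compare the closed form with simulation for a 10-member ensemble.
    for delta in (0.0, 0.5, 1.0):
        exact = averaged_error_variance(1.0, delta, 10)
        approx = simulate_averaged_variance(1.0, delta, 10)
        print(f"delta={delta:.1f}  closed form={exact:.3f}  simulated={approx:.3f}")
```

The sketch covers only the variance of an averaged output. It says nothing about bias or about how lower correlation might be achieved in practice.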