AUTOMATIC INTERACTION DETECTION

Abstract
INTRODUCTION

Automatic Interaction Detection (AID) is a family of methods for handling regression-type data in a way that is almost free of the assumptions usually required by linear hypothesis methods. In AID, one has a dependent variable Y which one wishes to predict, and a vector of predictors X from which to predict Y. The predictors are all categorical (i.e., either nominal or ordinal), and generally take on only a few possible values. Interval predictors may be reduced to this form by grouping their possible values into classes, and then using the resulting (ordinal) class variable as the predictor.

Different methods within the AID family have been devised for situations in which the dependent variable Y is: (a) a scalar interval variable, (b) a scalar nominal variable, (c) a vector of interval variables. Other possibilities, such as an ordinal Y or a vector of nominal variables, are easy to fit into the general conceptual framework of AID.

The name AID suggests that the function of the technique is to discover whether the linear hypothesis model predicting Y from X contains only main effects, or whether interactions also occur. This is indeed one of the things that AID can do, but it has a number of other uses as well, which in our view overshadow this one in importance. Before going into a detailed study of the aims and methods of AID, it may help to consider a simple example.