Chemometrics and spectral frequency selection

Abstract
In many fields of science, the simple straight line has received more attention as a basis for calibration than any other form. This is because measuring devices have been mainly univariate and have had calibration curves that were sufficiently linear. As scientific fields become more computationally intensive, they rely increasingly on computer-driven multivariate measurement devices, for which the number of responses may be large. For example, modern scanning infrared (IR) spectroscopes measure the absorptions or reflectances at a sequence of around one thousand frequencies. Training data may consist of roughly 10 to 100 carefully designed samples for which the true composition is either known by formulation or accurately determined by wet chemistry. The aim is then to predict the true composition of future samples from their spectra. In this paper we develop a variable selection approach that is both simple in concept and computationally easy to implement. Its motivation is the minimization of the width of a confidence interval. The technique for data reduction is illustrated on a mid-IR spectroscopic analysis of a liquid detergent, in which the calibration data consist of 12 observations of absorptions at 1168 frequency channels (responses) corresponding to five chemical ingredients.