Estimation of prediction error for multivariate calibration

Abstract
When arrays of non‐selective sensors or overlapping spectra are used for chemical analysis, multivariate calibration is required to relate the instrument responses to individual analytes. A multivariate mathematical model is constructed for one or more analytes from a set of carefully selected calibration samples. If this step is successful, the model can be used to predict the concentrations of these analytes in prospective samples. Previously, the equations required to estimate the errors in the predicted concentrations, and from these the confidence intervals, were not available because the three sources of error (measured responses from the calibration data, concentrations of the analytes in the calibration set and measured responses from the unknown sample) propagate in a non‐linear manner not amenable to statistical analysis. A new theory of error propagation is developed here. It does not require estimates of the three sources of error mentioned above and is therefore easy to implement. Data from near‐infrared reflectance spectrometry of wheat samples were used to test the equations derived from the theory. Complete agreement between the true prediction errors and those estimated from the theory is demonstrated.
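
For readers unfamiliar with the setting, the calibrate-then-predict workflow described above can be sketched as follows. This is a minimal illustration only and is not the error-propagation theory of this paper: it assumes an ordinary least-squares (inverse) calibration model, uses synthetic data in place of real spectra, and measures prediction error empirically as a root-mean-square error on held-out samples. All names (`fit_calibration`, `predict`, `rmsep`), the noise level and the simulated pure-component responses are assumptions made for the example.

```python
import numpy as np


def fit_calibration(R_cal, c_cal):
    """Fit a regression vector b so that c is approximated by R @ b.

    Assumption: plain least squares stands in for whatever calibration
    method is actually used; it is chosen only for brevity.
    """
    b, *_ = np.linalg.lstsq(R_cal, c_cal, rcond=None)
    return b


def predict(R_new, b):
    """Predict analyte concentrations from new instrument responses."""
    return R_new @ b


def rmsep(c_true, c_pred):
    """Empirical root-mean-square error of prediction."""
    return float(np.sqrt(np.mean((c_true - c_pred) ** 2)))


# Synthetic example: two analytes with overlapping responses on six channels.
rng = np.random.default_rng(0)
n_cal, n_test, n_channels = 40, 20, 6
S = rng.uniform(0.1, 1.0, size=(2, n_channels))   # hypothetical pure responses
C_cal = rng.uniform(0.0, 1.0, size=(n_cal, 2))    # calibration concentrations
C_test = rng.uniform(0.0, 1.0, size=(n_test, 2))  # held-out concentrations
noise = 0.01                                      # assumed response noise level

# Measured responses = mixture of pure responses plus measurement noise.
R_cal = C_cal @ S + noise * rng.standard_normal((n_cal, n_channels))
R_test = C_test @ S + noise * rng.standard_normal((n_test, n_channels))

# Calibrate for the first analyte, then predict it in the held-out samples.
b = fit_calibration(R_cal, C_cal[:, 0])
c_pred = predict(R_test, b)
error = rmsep(C_test[:, 0], c_pred)
print(f"Empirical RMSEP for analyte 1: {error:.4f}")
```

In this sketch the prediction error can be computed only because the held-out concentrations are known. For a genuinely unknown sample no such reference value exists, which is why equations that estimate the prediction error directly are needed.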