Population Definition, Sample Selection, and Calibration Procedures for Near Infrared Reflectance Spectroscopy

Abstract
Accurate near infrared reflectance spectroscopy calibration depends on collecting an appropriate population of samples and on choosing a suitable mathematical procedure for the calibration. The purpose of this study was to evaluate two algorithms (CENTER and SELECT) for defining the population and selecting samples for calibration. The selected samples were then used to compare modified partial least squares regression (MPLSR) with modified stepwise regression (MSR) as calibration methods. The algorithms were developed to (i) establish the boundaries of a population of samples in terms of the standardized Mahalanobis distance (H) from the mean and (ii) select a small, structured set of samples for calibration using the standardized H distance between sample pairs. Two diverse populations of samples were used to test these approaches, and calibrations were performed on each with both MPLSR and MSR. A standardized H distance of 3.0 from the mean was used as the boundary for excluding spectral outliers from a population, and a minimum standardized H distance of 0.6 between samples provided an adequate number of calibration samples for accurate predictions. Both regression methods provided acceptable validation statistics for crude protein, acid detergent fiber, and in vitro dry matter disappearance. Overall, the MPLSR calibration method gave an 18% improvement in standard error of performance (SEP) compared with the MSR calibration method.
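
The two steps described above (a boundary on standardized H from the mean, then a minimum standardized H between sample pairs) can be illustrated with a minimal sketch. This is not the CENTER and SELECT code itself. Several details are assumptions made for illustration only: the distances are computed on principal component scores, H is standardized by dividing the squared Mahalanobis distance by the number of components, and the selection step is a simple greedy rule that keeps the sample with the most close neighbors and discards those neighbors. The function names, the number of components, and the synthetic spectra are all hypothetical.

```python
import numpy as np


def pca_scores(spectra, n_components):
    """Project mean-centered spectra onto their leading principal components."""
    centered = spectra - spectra.mean(axis=0)
    # Rows of vt are the principal axes of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def standardized_h_from_mean(scores):
    """Squared Mahalanobis distance from the mean, divided by the number of
    components (an assumed standardization; the average is close to 1)."""
    scaled = scores / scores.std(axis=0, ddof=1)
    return (scaled ** 2).sum(axis=1) / scores.shape[1]


def pairwise_standardized_h(scores):
    """Standardized H between every pair of samples (same assumed scaling)."""
    scaled = scores / scores.std(axis=0, ddof=1)
    diff = scaled[:, None, :] - scaled[None, :, :]
    return (diff ** 2).sum(axis=2) / scores.shape[1]


def center_population(spectra, n_components=5, max_h=3.0):
    """Flag samples whose standardized H from the mean is within max_h."""
    scores = pca_scores(spectra, n_components)
    keep = standardized_h_from_mean(scores) <= max_h
    return scores, keep


def select_samples(scores, min_h=0.6):
    """Greedy selection: repeatedly keep the sample with the most neighbors
    closer than min_h, then drop that sample's neighbors from the pool."""
    pair_h = pairwise_standardized_h(scores)
    remaining = np.ones(len(scores), dtype=bool)
    selected = []
    while remaining.any():
        # Neighbor relation restricted to samples still in the pool.
        neighbors = (pair_h < min_h) & remaining[None, :] & remaining[:, None]
        counts = neighbors.sum(axis=1)
        counts[~remaining] = -1  # never pick a sample already removed
        best = int(np.argmax(counts))
        selected.append(best)
        remaining &= ~neighbors[best]  # removes best and its neighbors
    return selected


# Synthetic stand-in for spectra: 200 samples x 50 wavelengths,
# with the last sample shifted far from the rest to act as an outlier.
rng = np.random.default_rng(0)
spectra = rng.normal(size=(200, 50))
spectra[-1] += 50.0

scores, keep = center_population(spectra, n_components=5, max_h=3.0)
population_scores = scores[keep]
selected = select_samples(population_scores, min_h=0.6)
print(f"{keep.sum()} of {len(spectra)} samples kept, {len(selected)} selected")
```

In this sketch the scores are not recomputed after outliers are removed, which a fuller treatment would probably do. The thresholds 3.0 and 0.6 are the values reported in the abstract; whether they transfer to a differently standardized distance is not something this example can establish.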