Raman and NIR Spectroscopic Methods for Determination of Total Dietary Fiber in Cereal Foods: A Comparative Study

Abstract
Partial least-squares regression (PLSR) was used to generate three Raman and three near-infrared reflectance (NIRR) models for the spectroscopic determination of total dietary fiber (TDF) of a wide variety of cereal foods. To allow comparison of the spectral techniques, both analyses used the same sets of samples (ncal = 63, nval = 63). Six models were optimized by full leave-one-out cross-validation (CV), including a smoothed, a first-, and a second-derivative model for each spectral technique. Both kinds of raw spectral data required correction of interfering baseline and amplitude variations. Derivative preprocessing generally reduced the number of latent variables (LVs) for both spectral types and significantly reduced the CV error of the Raman models. The derivative treatments enhanced the influence of select vibrational-bandwidth-sized features in the Raman data (64–84 cm−1). The Savitzky–Golay derivative calculation method was better for the NIRR data, while the gap-difference method was better for the Raman data, which had a higher level of baseline noise. The Raman models required 6 to 9 LVs, while the NIRR models required 10 to 14 LVs. The root-mean-squared CV model errors were 2–2.3% TDF for all six models, and the three Raman models had root-mean-squared prediction errors (RMSEPs) in the range 2.8–3.2% TDF, with the best model being generated from second-derivative data. First-derivative data provided the best NIRR model, and for all three NIRR models the RMSEP spanned 2.4–2.9%. For some types of samples, it is suggested that the Raman method is limited by its sampling technique and could be improved with more densely packed, larger-area specimens. The regression vectors of the Raman models seem more easily interpretable than those of the NIRR models.
Either spectral method appears capable of achieving an acceptable level of error; TDF reference method precision was 0.68% TDF, while the product label information had an error of 2.8% TDF relative to the reference.
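
As an illustration of two quantities named above, the following is a minimal sketch of a gap-difference first derivative and of a root-mean-squared error of the kind reported as RMSEP. It assumes uniformly spaced spectral points and a centered difference; the function names `gap_difference` and `rmse`, the gap parameter, and the toy values are hypothetical and are not taken from the study, whose exact preprocessing settings are not given here.

```python
import math


def gap_difference(spectrum, gap, step=1.0):
    """Centered gap-difference estimate of the first derivative.

    Each output point is (y[i + gap] - y[i - gap]) / (2 * gap * step),
    so the result is shorter than the input by 2 * gap points.
    Assumes the spectrum is sampled at a uniform spacing `step`.
    """
    if gap < 1:
        raise ValueError("gap must be a positive integer")
    if len(spectrum) <= 2 * gap:
        raise ValueError("spectrum is too short for this gap")
    return [
        (spectrum[i + gap] - spectrum[i - gap]) / (2 * gap * step)
        for i in range(gap, len(spectrum) - gap)
    ]


def rmse(predicted, reference):
    """Root-mean-squared error between predicted and reference values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Toy data (not from the study): a linear "baseline" with slope 3.
toy_spectrum = [3.0 * x + 5.0 for x in range(10)]
first_derivative = gap_difference(toy_spectrum, gap=2)

# A constant offset added to the spectrum does not change the derivative.
shifted = [y + 10.0 for y in toy_spectrum]
shifted_derivative = gap_difference(shifted, gap=2)

# Applying the derivative twice removes a linear baseline as well.
second_derivative = gap_difference(first_derivative, gap=1)
```

A wider gap averages over more points, which is one plausible reason a gap-based derivative can tolerate noisier baselines; that reading is an interpretation offered here, not a result stated in the abstract.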