Canonical Correlation Analysis of Mid- and Near-Infrared Oil Spectra

Abstract
A mathematical procedure based on Canonical Correlation Analysis (CCA) was used to assign the wavelengths of near-infrared spectra from knowledge of the corresponding mid-infrared spectra. The relevance of the treatment was tested on commercial oils that differ mainly in their level of unsaturation. First, two separate Principal Component Analyses (PCAs) were performed on the near- and mid-infrared data to overcome the high intercorrelations across wavelengths. CCA was then applied to the resulting principal components. The near- and mid-infrared canonical variates were computed so that they were maximally correlated with each other. The procedure makes it possible to draw CCA spectral patterns that exhibit significant positive and negative peaks. The first near-infrared canonical variate was highly correlated with the first mid-infrared canonical variate (r² = 0.97). The corresponding near- and mid-infrared CCA spectral patterns were therefore given the same interpretation. The mid-infrared pattern contrasted negative peaks characteristic of CH₂ groups with positive peaks of CH₃ and =CH groups. Consequently, in the near-infrared pattern, the positive peaks at 1708, 2140, 2170, and 2480 nm were assigned to CH₃ or =CH groups, and the negative peaks at 2304, 2344, and 2445 nm were assigned to CH₂ groups. A more precise interpretation was obtained by comparing the observed wavelengths with theoretical values and with previous assignments.
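
The PCA-then-CCA sequence described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic (one shared latent factor standing in for unsaturation), the number of retained components (3), the channel counts, and the helper names `pca` and `cca` are all assumptions made for illustration, and the CCA step uses a standard QR-plus-SVD formulation.

```python
import numpy as np

def pca(X, n_components):
    """Mean-centre X and return PCA scores and loadings (via SVD)."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components]          # shape: (n_components, n_channels)
    return scores, loadings

def cca(A, B):
    """Canonical correlations and weights for two centred, full-rank blocks."""
    Qa, Ra = np.linalg.qr(A)              # orthonormal basis of each block
    Qb, Rb = np.linalg.qr(B)
    U, rho, Vt = np.linalg.svd(Qa.T @ Qb) # singular values = canonical correlations
    Wa = np.linalg.solve(Ra, U)           # weights applied to the columns of A
    Wb = np.linalg.solve(Rb, Vt.T)        # weights applied to the columns of B
    return rho, Wa, Wb

# Synthetic stand-in for the oil spectra (hypothetical sizes and values).
rng = np.random.default_rng(0)
n_samples, n_mid, n_nir = 40, 60, 50
t = rng.normal(size=n_samples)            # shared factor, e.g. unsaturation
X_mid = (np.outer(t, rng.normal(size=n_mid))
         + 0.5 * np.outer(rng.normal(size=n_samples), rng.normal(size=n_mid))
         + 0.05 * rng.normal(size=(n_samples, n_mid)))
X_nir = (np.outer(t, rng.normal(size=n_nir))
         + 0.5 * np.outer(rng.normal(size=n_samples), rng.normal(size=n_nir))
         + 0.05 * rng.normal(size=(n_samples, n_nir)))

# Step 1: separate PCAs remove the collinearity across wavelengths.
scores_mid, load_mid = pca(X_mid, 3)
scores_nir, load_nir = pca(X_nir, 3)

# Step 2: CCA on the principal component scores.
rho, W_nir, W_mid = cca(scores_nir, scores_mid)

# First pair of canonical variates and their correlation.
u_nir = scores_nir @ W_nir[:, 0]
v_mid = scores_mid @ W_mid[:, 0]
r_first = np.corrcoef(u_nir, v_mid)[0, 1]

# Step 3: back-project the canonical weights to the wavelength axis
# to obtain one "CCA spectral pattern" per spectral range.
pattern_nir = load_nir.T @ W_nir[:, 0]    # shape: (n_nir,)
pattern_mid = load_mid.T @ W_mid[:, 0]    # shape: (n_mid,)

print("first canonical correlation:", rho[0])
```

Because each canonical variate is a linear combination of principal component scores, and the scores are themselves linear combinations of the centred spectra, the back-projected pattern gives one coefficient per wavelength. Its positive and negative peaks are what the abstract interprets band by band.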