Infrared Spectral Search for Mixtures in Large-Size Libraries

Abstract
A routine for searching large spectral libraries with spectra of mixtures is presented. The dimensionality of a 3169-compound library is reduced to 12% of its original size by using Fourier transform compression and principal component analysis. A principal component regression is performed and used as a prefilter in selecting spectra having features (and chemical groups) similar to those of the unknown mixture. A dot-product metric is then used to identify a target component from the subgroup formed by the prefilter. This is followed by the application of an adaptive filter to remove the similarity of the target component from the subgroup and from the unknown mixture; the search is repeated on the modified data. Successive applications of the adaptive filter will produce minimum residuals if the correct identifications are made. Once the residuals are minimized, a similarity index is calculated to determine the closeness of the unknown mixture spectrum to a spectrum reconstructed from the library spectra. Four out of five two- and three-component spectra were correctly identified. One of the two components in the fifth mixture was correctly identified, and the residual values flagged the improper identification of the second component. After the adaptive filter was applied to the entire library, the second component was correctly identified. Results for this new algorithm are compared to those from four more traditional search routines, which were only completely successful on one of the unknown mixtures.
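The identify, remove, and repeat loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It omits the Fourier transform compression and the principal component regression prefilter, and it substitutes a plain orthogonal projection for the adaptive filter, which is an assumption made here for brevity. The function names (`search_mixture`, `remove_component`) and the toy four-point spectra are hypothetical.

```python
import math

def dot(a, b):
    """Dot product of two equal-length spectra."""
    return sum(x * y for x, y in zip(a, b))

def norm(v):
    return math.sqrt(dot(v, v))

def normalize(v):
    n = norm(v)
    return [x / n for x in v] if n > 0 else list(v)

def remove_component(v, target_unit):
    """Subtract from v its projection onto the unit vector target_unit.
    Assumption: stands in for the adaptive filter named in the abstract."""
    c = dot(v, target_unit)
    return [x - c * t for x, t in zip(v, target_unit)]

def search_mixture(library, mixture, max_components=3, tol=1e-6):
    """Repeatedly pick the best dot-product match, then remove that
    component from both the mixture and the remaining library spectra."""
    lib = {name: list(s) for name, s in library.items()}
    residual = list(mixture)
    hits = []
    for _ in range(max_components):
        if norm(residual) < tol:
            break  # nothing left to explain
        best_name, best_score = None, 0.0
        for name, s in lib.items():
            if norm(s) < tol:
                continue  # spectrum fully removed by earlier steps
            score = dot(normalize(s), normalize(residual))
            if score > best_score:
                best_name, best_score = name, score
        if best_name is None:
            break
        hits.append(best_name)
        target = normalize(lib.pop(best_name))
        residual = remove_component(residual, target)
        lib = {n: remove_component(s, target) for n, s in lib.items()}
    return hits, norm(residual)

# Toy library of three overlapping "spectra" and a 70:30 mixture of A and B.
library = {
    "A": [1.0, 0.0, 0.0, 0.2],
    "B": [0.0, 1.0, 0.0, 0.1],
    "C": [0.0, 0.0, 1.0, 0.0],
}
mixture = [0.7 * a + 0.3 * b for a, b in zip(library["A"], library["B"])]
hits, residual_norm = search_mixture(library, mixture)
print(hits, residual_norm)
```

In this toy case the residual norm falls to essentially zero once both components are removed, mirroring the abstract's point that minimum residuals indicate correct identifications. A large remaining residual would flag a suspect match.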