Functional Principal Component Regression and Functional Partial Least Squares

Abstract
Regression of a scalar response on signal predictors, such as near-infrared (NIR) spectra of chemical samples, presents a major challenge when, as is typically the case, the dimension of the signals far exceeds their number. Most solutions to this problem reduce the dimension of the predictors either by regressing on components [e.g., principal component regression (PCR) and partial least squares (PLS)] or by smoothing methods, which restrict the coefficient function to the span of a spline basis. This article introduces functional versions of PCR and PLS, which combine both of the foregoing dimension-reduction approaches. Two versions of functional PCR are developed, both using B-splines and roughness penalties. The regularized-components version applies such a penalty to the construction of the principal components (i.e., it uses functional principal components), whereas the regularized-regression version incorporates a penalty in the regression. For the latter form of functional PCR, the penalty parameter may be selected by generalized cross-validation, restricted maximum likelihood (REML), or a minimum mean integrated squared error criterion. Proceeding similarly, we develop two versions of functional PLS. Asymptotic convergence properties of regularized-regression functional PCR are demonstrated. A simulation study and split-sample validation with several NIR spectroscopy data sets indicate that functional PCR and functional PLS, especially the regularized-regression versions with REML, offer advantages over existing methods in terms of both estimation of the coefficient function and prediction of future observations.
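As a rough illustration of the regularized-regression idea, the sketch below regresses a scalar response on the leading principal components of sampled signals while penalizing the roughness of the implied coefficient function. It is a simplified discrete analogue and not the method developed in this article: it works on the raw sampling grid instead of a B-spline basis, uses a second-difference penalty, and takes a hand-picked penalty parameter where the article would select it by generalized cross-validation, REML, or a mean integrated squared error criterion. The function names (`second_difference_penalty`, `penalized_pcr`), the parameter `lam`, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def second_difference_penalty(p):
    """Return P = D'D, where D takes second differences of a length-p vector."""
    D = np.diff(np.eye(p), n=2, axis=0)  # shape (p - 2, p)
    return D.T @ D


def penalized_pcr(X, y, n_components, lam):
    """Regress y on the leading principal components of X with a roughness
    penalty on the implied coefficient vector (illustrative sketch only).

    Returns (intercept, beta), where beta has length X.shape[1].
    Setting lam = 0 gives ordinary principal component regression.
    """
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean

    # Rows of Vt are the principal component directions of the centered signals.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T            # (p, q) loadings
    Z = Xc @ V                         # (n, q) component scores

    # Write beta = V @ gamma and minimize
    #   ||y_c - Z gamma||^2 + lam * gamma' (V' P V) gamma.
    P = second_difference_penalty(X.shape[1])
    A = Z.T @ Z + lam * (V.T @ P @ V)
    gamma = np.linalg.solve(A, Z.T @ (y - y_mean))

    beta = V @ gamma
    intercept = y_mean - x_mean @ beta
    return intercept, beta


# Synthetic "spectra": far more sampling points (p) than samples (n).
rng = np.random.default_rng(0)
n, p = 40, 100
t = np.linspace(0.0, 1.0, p)
basis = np.array([np.sin((k + 1) * np.pi * t) for k in range(10)])  # (10, p)
X = rng.normal(size=(n, 10)) @ basis + 0.01 * rng.normal(size=(n, p))
beta_true = np.sin(2 * np.pi * t)      # assumed true coefficient function
y = X @ beta_true / p + 0.05 * rng.normal(size=n)

b0_pcr, beta_pcr = penalized_pcr(X, y, n_components=5, lam=0.0)   # plain PCR
b0_pen, beta_pen = penalized_pcr(X, y, n_components=5, lam=10.0)  # penalized
```

In this sketch the penalty acts only within the span of the retained components, so increasing `lam` trades fit for a smoother coefficient estimate without changing the components themselves. A regularized-components variant would instead apply the penalty when the components are constructed.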