Cross-validation of component models: A critical look at current methods

24 January 2008

journal article
review article
Published by Springer Science and Business Media LLC in Analytical and Bioanalytical Chemistry

Vol. 390 (5), 1241-1251
https://doi.org/10.1007/s00216-007-1790-1

Abstract

In regression, cross-validation is an effective and popular approach that is used to decide, for example, the number of underlying features, and to estimate the average prediction error. The basic principle of cross-validation is to leave out part of the data, build a model, and then predict the left-out samples. While such an approach can also be envisioned for component models such as principal component analysis (PCA), most current implementations do not comply with the essential requirement that the predictions should be independent of the entity being predicted. Further, these methods have not been properly reviewed in the literature. In this paper, we review the most commonly used generic PCA cross-validation schemes and assess how well they work in various scenarios.
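The regression case described in the abstract (leave out part of the data, build a model, predict the left-out samples) can be illustrated with a minimal sketch. It is not taken from the paper: the use of principal component regression, the synthetic two-factor data, and the names `pcr_fit_predict` and `cross_validate` are all assumptions made for illustration. It does not cover cross-validation of PCA itself, which is the harder case the paper reviews.

```python
import numpy as np

def pcr_fit_predict(X_train, y_train, X_test, n_comp):
    """Fit principal component regression on the training rows, predict the test rows."""
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean()
    Xc = X_train - x_mean
    # Loadings come from the training data only, so the left-out
    # samples have no influence on the model that predicts them.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_comp].T
    T = Xc @ P
    coef, *_ = np.linalg.lstsq(T, y_train - y_mean, rcond=None)
    return (X_test - x_mean) @ P @ coef + y_mean

def cross_validate(X, y, n_comp, n_folds=5, seed=0):
    """Return the root mean squared error of cross-validation (RMSECV)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    press = 0.0  # predicted residual sum of squares
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        y_hat = pcr_fit_predict(X[train_mask], y[train_mask], X[test_idx], n_comp)
        press += np.sum((y[test_idx] - y_hat) ** 2)
    return np.sqrt(press / len(y))

# Hypothetical data: two underlying factors plus a little noise.
rng = np.random.default_rng(1)
scores = rng.normal(size=(60, 2))
loadings = rng.normal(size=(2, 8))
X = scores @ loadings + 0.05 * rng.normal(size=(60, 8))
y = scores @ np.array([1.0, -2.0]) + 0.05 * rng.normal(size=60)

rmsecv = {k: cross_validate(X, y, k) for k in range(1, 6)}
for k, err in rmsecv.items():
    print(f"{k} components: RMSECV = {err:.3f}")
```

In a sketch like this, the number of components would typically be chosen where the RMSECV stops decreasing appreciably. The prediction of each left-out sample is independent of that sample because `y` for the test rows never enters the fit; in PCA there is no separate `y`, which is why the same independence is harder to guarantee.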

This publication has 12 references indexed in Scilit:

Cross-validation of multiway component models
Journal of Chemometrics, 1999
Weighted least squares fitting using ordinary least squares algorithms
Psychometrika, 1997
Selection of optimal regression models via cross‐validation
Journal of Chemometrics, 1988
Cross-validatory choice in principal component analysis; some sampling results
Journal of Statistical Computation and Simulation, 1983
Cross-Validatory Choice of the Number of Components From a Principal Component Analysis
Technometrics, 1982
Cross-Validatory Estimation of the Number of Components in Factor and Principal Components Models
Technometrics, 1978
Pattern recognition by means of disjoint principal components models
Pattern Recognition, 1976
A predictive approach to the random effect model
Biometrika, 1974
The Relationship Between Variable Selection and Data Augmentation and a Method for Prediction
Technometrics, 1974
I. Problems and Designs of Cross-Validation
Educational and Psychological Measurement, 1951

Cited by 262 articles