A statistical approach to learning and generalization in layered neural networks
- 1 October 1990
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in Proceedings of the IEEE
- Vol. 78 (10), 1568-1574
- https://doi.org/10.1109/5.58339
Abstract
A general statistical description of the problem of learning from examples is presented. Learning in layered networks is posed as a search in the network parameter space for a network that minimizes an additive error function over a set of statistically independent examples. By imposing the equivalence of the minimum-error and maximum-likelihood criteria for training the network, the Gibbs distribution on the ensemble of networks with a fixed architecture is derived. The probability of correct prediction of a novel example can be expressed using this ensemble, serving as a measure of the network's generalization ability. The entropy of the prediction distribution is shown to be a consistent measure of the network's performance. The proposed formalism is applied to the problems of selecting an optimal architecture and predicting learning curves.
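To make the formalism summarized above concrete, the following is a minimal sketch of the relations the abstract describes, written in notation chosen here rather than taken verbatim from the paper: w denotes the network parameters, \varepsilon the per-example error, \rho a prior density on parameters, \beta an inverse-temperature-like parameter, and Z the normalizing partition function.

\[
E(\mathbf{w}) \;=\; \sum_{i=1}^{m} \varepsilon(\mathbf{w};\, \mathbf{x}_i, y_i),
\qquad
P(\mathbf{w} \mid \text{examples}) \;=\; \frac{1}{Z}\, \rho(\mathbf{w})\, e^{-\beta E(\mathbf{w})},
\qquad
Z \;=\; \int d\mathbf{w}\; \rho(\mathbf{w})\, e^{-\beta E(\mathbf{w})}.
\]

The probability of correctly predicting a novel example then follows as an average over this Gibbs ensemble, e.g.

\[
P(y \mid \mathbf{x}, \text{examples}) \;=\; \int d\mathbf{w}\; P(\mathbf{w} \mid \text{examples})\,
\frac{e^{-\beta\, \varepsilon(\mathbf{w};\, \mathbf{x}, y)}}{\int dy'\, e^{-\beta\, \varepsilon(\mathbf{w};\, \mathbf{x}, y')}},
\]

and the entropy of this prediction distribution is the performance measure referred to in the abstract.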