Statistical mechanics of learning from examples
- 1 April 1992
- journal article
- research article
- Published by American Physical Society (APS) in Physical Review A
- Vol. 45 (8), 6056-6091
- https://doi.org/10.1103/physreva.45.6056
Abstract
Learning from examples in feedforward neural networks is studied within a statistical-mechanical framework. Training is assumed to be stochastic, leading to a Gibbs distribution of networks characterized by a temperature parameter T. Learning of realizable rules as well as of unrealizable rules is considered. In the latter case, the target rule cannot be perfectly realized by a network of the given architecture. Two useful approximate theories of learning from examples are studied: the high-temperature limit and the annealed approximation. Exact treatment of the quenched disorder generated by the random sampling of the examples leads to the use of the replica theory. Of primary interest is the generalization curve, namely, the average generalization error versus the number of examples P used for training. The theory implies that, for a reduction in the generalization error that remains finite in the large-N limit, P should generally scale as αN, where N is the number of independently adjustable weights in the network. We show that for smooth networks, i.e., those with continuously varying weights and smooth transfer functions, the generalization curve asymptotically obeys an inverse power law. In contrast, for nonsmooth networks other behaviors can appear, depending on the nature of the nonlinearities as well as the realizability of the rule. In particular, a discontinuous learning transition from a state of poor to a state of perfect generalization can occur in nonsmooth networks learning realizable rules.
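The setting described in the abstract can be illustrated with a small numerical sketch. It is not the paper's method (the paper is analytical); it assumes a single-layer "student" perceptron learning a realizable rule defined by a "teacher" perceptron, isotropic Gaussian inputs, a training energy equal to the number of misclassified examples, and a simple Metropolis sampler for the Gibbs distribution exp(-E/T). All function names and parameter values are hypothetical.

```python
import numpy as np

def training_energy(w, X, y):
    """Number of training examples that the student w misclassifies."""
    return int(np.sum(np.sign(X @ w) != y))

def generalization_error(w, teacher):
    """Disagreement probability on a fresh isotropic Gaussian input.

    For two perceptrons this equals the angle between the weight
    vectors divided by pi (assumption: isotropic input distribution).
    """
    overlap = np.dot(w, teacher) / (np.linalg.norm(w) * np.linalg.norm(teacher))
    return float(np.arccos(np.clip(overlap, -1.0, 1.0)) / np.pi)

def sample_gibbs(X, y, T, n_steps=2000, step=0.1, seed=0):
    """Metropolis sampling from exp(-E(w)/T) on the sphere |w|^2 = N."""
    rng = np.random.default_rng(seed)
    N = X.shape[1]
    w = rng.standard_normal(N)
    w *= np.sqrt(N) / np.linalg.norm(w)
    E = training_energy(w, X, y)
    for _ in range(n_steps):
        # Isotropic perturbation followed by projection back to the sphere;
        # the proposal depends only on the angle, so it is symmetric.
        proposal = w + step * rng.standard_normal(N)
        proposal *= np.sqrt(N) / np.linalg.norm(proposal)
        E_new = training_energy(proposal, X, y)
        # Accept downhill moves always, uphill moves with Boltzmann weight.
        if E_new <= E or rng.random() < np.exp(-(E_new - E) / T):
            w, E = proposal, E_new
    return w, E

def curve_point(N, alpha, T, seed=0, n_steps=2000):
    """One point of a generalization curve with P = alpha * N examples."""
    rng = np.random.default_rng(seed)
    teacher = rng.standard_normal(N)
    P = int(alpha * N)
    X = rng.standard_normal((P, N))
    y = np.sign(X @ teacher)  # realizable rule: labels come from the teacher
    w, _ = sample_gibbs(X, y, T, n_steps=n_steps, seed=seed + 1)
    return generalization_error(w, teacher)

if __name__ == "__main__":
    for alpha in (0.5, 1.0, 2.0, 4.0):
        print(alpha, curve_point(N=30, alpha=alpha, T=0.5))
```

Averaging `curve_point` over many seeds at fixed N, for a range of α = P/N, gives a rough empirical generalization curve. A single short Metropolis run at small N is noisy and may not have equilibrated, so the printed values are only indicative.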