Statistical mechanics of learning from examples

Abstract
Learning from examples in feedforward neural networks is studied within a statistical-mechanical framework. Training is assumed to be stochastic, leading to a Gibbs distribution of networks characterized by a temperature parameter T. Learning of both realizable and unrealizable rules is considered; in the latter case, the target rule cannot be perfectly realized by a network of the given architecture. Two useful approximate theories of learning from examples are studied: the high-temperature limit and the annealed approximation. Exact treatment of the quenched disorder generated by the random sampling of the examples leads to the use of replica theory. Of primary interest is the generalization curve, namely, the average generalization error εg versus the number of examples P used for training. The theory implies that, for a reduction in εg that remains finite in the large-N limit, P should generally scale as αN, where N is the number of independently adjustable weights in the network. We show that for smooth networks, i.e., those with continuously varying weights and smooth transfer functions, the generalization curve asymptotically obeys an inverse power law. In contrast, for nonsmooth networks other behaviors can appear, depending on the nature of the nonlinearities as well as on the realizability of the rule. In particular, a discontinuous learning transition from a state of poor to a state of perfect generalization can occur in nonsmooth networks learning realizable rules.
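
To fix notation, the following display is a minimal sketch of the central quantities named above, written under standard conventions for this framework; the per-example error ε(W; S^μ), the partition function Z, and the asymptotic prefactor c are generic placeholders introduced here for illustration, not results quoted from the text:

\[
P(\mathbf{W}) \;=\; \frac{1}{Z}\, e^{-E(\mathbf{W})/T},
\qquad
E(\mathbf{W}) \;=\; \sum_{\mu=1}^{P} \epsilon(\mathbf{W};\mathbf{S}^{\mu}),
\]
\[
\epsilon_g(\alpha) \;=\; \Big\langle\!\Big\langle\, \epsilon(\mathbf{W}) \,\Big\rangle_{T}\Big\rangle_{\{\mathbf{S}^{\mu}\}},
\qquad
\alpha \;=\; \frac{P}{N},
\qquad
\epsilon_g(\alpha) \;\sim\; \frac{c}{\alpha}
\quad (\alpha \to \infty,\ \text{smooth networks}).
\]

Here ⟨·⟩_T denotes the thermal (Gibbs) average at temperature T, and the outer brackets denote the quenched average over the randomly drawn training examples S^1, …, S^P; the final relation is the inverse power law stated for smooth networks, whereas nonsmooth networks can instead exhibit the discontinuous transitions described above.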