Abstract
Efficient supervised learning on large, redundant training sets requires algorithms in which the amount of computation needed to prepare each weight update is independent of the training set size. Off-line algorithms, such as the standard conjugate gradient algorithms, do not have this property, whereas on-line algorithms, such as the stochastic backpropagation algorithm, do. A new algorithm that combines the good properties of off-line and on-line algorithms is introduced.
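The difference in per-update cost can be made concrete with a minimal sketch. It assumes a linear least-squares model with synthetic, noiseless data, and it is not the algorithm introduced here. The function names, step size, and problem sizes are illustrative assumptions. The off-line gradient touches all N examples for a single update, while the on-line gradient touches one example regardless of N.

```python
import numpy as np

def batch_gradient(w, X, y):
    """Off-line style: one gradient requires a pass over all N examples."""
    residual = X @ w - y
    return X.T @ residual / len(y)

def stochastic_gradient(w, X, y, i):
    """On-line style: one gradient uses a single example, so cost is independent of N."""
    residual = X[i] @ w - y[i]
    return X[i] * residual

# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
N, d = 10_000, 5
X = rng.normal(size=(N, d))
w_true = rng.normal(size=d)
y = X @ w_true  # noiseless targets

# On-line training: each update costs O(d), not O(N * d).
w = np.zeros(d)
lr = 0.01  # hypothetical step size
for step in range(5_000):
    i = rng.integers(N)
    w -= lr * stochastic_gradient(w, X, y, i)

sgd_error = np.linalg.norm(w - w_true)
```

Averaged over all examples, the single-example gradient equals the full-batch gradient, which is why the cheap on-line update still makes progress on a redundant training set.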