Abstract
A linear reinforcement learning technique is proposed to provide a memory and thereby accelerate the convergence of successive approximation algorithms. The learning scheme updates weighting coefficients applied to the components of the algorithm's correction terms. A search direction that approaches the direction of a "ridge" yields a gradient peak-seeking method that considerably accelerates convergence to a neighborhood of the extremum. In a stochastic approximation algorithm, the learning scheme provides the memory required to establish a consistent direction of search that is insensitive to perturbations introduced by the random variables involved. The accelerated algorithms and the respective proofs of convergence are presented. Illustrative examples demonstrate the validity of the proposed algorithms.
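The scheme described above can be illustrated with a minimal sketch. The abstract does not give the exact update rule, so the following assumes a simple linear reward-penalty rule: a component of the correction term whose sign persists across iterations is "rewarded" (its weighting coefficient grows toward 1, building memory of a consistent search direction), while a sign reversal is "penalized" (the coefficient shrinks). All function names, constants, and the quadratic test function are illustrative, not taken from the paper.

```python
import numpy as np

def accelerated_sa(grad, x0, gain=0.1, lr=0.3, steps=200):
    """Successive-approximation descent whose per-component step
    weights are adapted by a hypothetical linear reinforcement rule:
    sign agreement with the previous correction rewards the weight,
    disagreement penalizes it."""
    x = np.asarray(x0, dtype=float)
    w = np.full_like(x, 0.5)        # weighting coefficients (the "memory")
    prev_sign = np.zeros_like(x)
    for k in range(1, steps + 1):
        g = grad(x)                 # correction term (here: gradient)
        s = np.sign(g)
        agree = (s == prev_sign)
        # linear reward/penalty update of the weighting coefficients
        w = np.where(agree, w + lr * (1.0 - w), (1.0 - lr) * w)
        # stochastic-approximation step with decaying gain and weights
        x = x - (gain / k) * w * g
        prev_sign = s
    return x

# Ill-conditioned ("ridge"-like) quadratic with minimum at the origin
grad = lambda x: np.array([2.0 * x[0], 40.0 * x[1]])
x_star = accelerated_sa(grad, [5.0, 5.0])
```

Components whose corrections keep a consistent sign accumulate a large weight, so the effective step along the ridge direction stays large while the oscillating cross-ridge component is damped.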