Abstract
A class of stochastic automaton models is proposed for synthesizing a learning system that operates in a random environment. The models are based on a learning algorithm that relates the probability distribution of the system's response to the corresponding performance of the system. For several forms of the learning algorithm that satisfy specified requirements, with particular emphasis on a linear algorithm, the following desired learning behavior is shown to hold: 1) the mean performance converges monotonically to an extreme value, and 2) a criterion is available for determining the best response in the time limit. The learning models provide this behavior on-line while requiring little a priori knowledge of, or assumptions about, the environment. Some applications of the learning models to engineering systems are considered.
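
To make the idea of a learning algorithm acting on the response probability distribution concrete, the following is a minimal sketch that assumes the textbook linear reward-inaction update. It is not necessarily the linear algorithm developed in this paper. The function name, the `reward_probs` environment model, and the `rate`, `steps`, and `seed` parameters are all hypothetical choices made for illustration.

```python
import random


def linear_reward_inaction(reward_probs, rate=0.05, steps=5000, seed=0):
    """Illustrative stochastic automaton (assumed scheme, not the paper's).

    reward_probs[i] is the (hypothetical) probability that the random
    environment rewards response i. Returns the final response
    probability distribution.
    """
    rng = random.Random(seed)
    k = len(reward_probs)
    p = [1.0 / k] * k  # start with no preference among the k responses

    for _ in range(steps):
        # Draw a response according to the current distribution.
        i = rng.choices(range(k), weights=p)[0]
        # The random environment rewards response i with its own probability.
        rewarded = rng.random() < reward_probs[i]
        if rewarded:
            # Linear update: shrink every probability by (1 - rate),
            # then give the freed mass to the rewarded response.
            p = [(1.0 - rate) * pj for pj in p]
            p[i] += rate
        # On no reward the distribution is left unchanged ("inaction").
    return p


if __name__ == "__main__":
    final = linear_reward_inaction([0.2, 0.8])
    print([round(x, 3) for x in final])
```

The update keeps the probabilities summing to one at every step, and the automaton needs only the reward signal observed on-line, not the values in `reward_probs`, which are used here solely to simulate the environment.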