LEARNING TO RECOGNIZE PATTERNS WITHOUT A TEACHER

Abstract
The techniques of decision theory are applied to the problem of constructing machines that improve their ability to recognize patterns by extracting pertinent information from a previously unclassified sequence of observations; such machines are said to learn without a teacher. A general system solution is obtained that includes the solutions to the problems of learning without a teacher, learning with a teacher, and no learning. The solution is extended to problems in which the unknown parameter is time varying, and to problems in which the probabilities of occurrence of the classes are unknown a priori and must be learned. The resulting systems are shown to be stable and to have performance that converges to that of systems with a priori knowledge of the unknown parameters being learned. It is demonstrated that, for most cases, either the optimum system or a suboptimum system that performs within an arbitrarily small tolerance of the optimum system is realizable in the sense that it requires a finite memory.