Computer Identification of Constrained Handprinted Characters with a High Recognition Rate

Abstract
Handprinted character recognition by computer is accomplished on forty-nine character classes with a high recognition rate (> 99.4 percent). The form of the characters is constrained by requiring each character to be handprinted on a standard grid. The grid is composed of twenty line segments, each of which forms the basis for a feature, yielding twenty features to represent each character. A person printing these characters is not expected to remain precisely on the grid lines. The errors that do occur in following the grid lines are assumed to be normally distributed; therefore, each feature is based on a "longitudinal Gaussian-shaped surface." A page of constrained characters to be recognized is input to the computer using a television camera. Each character on the page is located, isolated from the other characters, and quantized into binary points. A novel technique utilizing a Gaussian point-to-line distance concept is employed to calculate each "feature value." The recognition program extracts the twenty feature values and attempts to determine to which of the forty-nine character classes the unknown character belongs. This decision is made from the twenty features using a weighted minimum-distance classifier. If only a marginal classification can be made, a second-level decision is applied: it uses the most discriminating features of the two most likely character classes in order to increase the likelihood of correct classification. All character-dependent data are obtained through training techniques.
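
The pipeline summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the Gaussian form exp(-d^2 / (2 sigma^2)), the summation over black points, the squared-difference distance, the absolute `margin` test, and the choice of "most discriminating" features by largest difference in class means are all assumptions, as are every function and parameter name.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the line segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def feature_value(points, segment, sigma=1.0):
    """Assumed feature: sum of Gaussian weights of the black points,
    where each weight decays with distance from one grid segment."""
    a, b = segment
    return sum(
        math.exp(-point_segment_distance(p, a, b) ** 2 / (2.0 * sigma ** 2))
        for p in points
    )

def extract_features(points, segments, sigma=1.0):
    """One feature per grid segment (twenty segments in the paper)."""
    return [feature_value(points, s, sigma) for s in segments]

def weighted_distance(x, mean, weights, indices=None):
    """Weighted squared distance, optionally over a subset of features."""
    if indices is None:
        indices = range(len(x))
    return sum(weights[i] * (x[i] - mean[i]) ** 2 for i in indices)

def classify(x, class_means, class_weights, margin=0.1, top_k=3):
    """First level: weighted minimum distance over all classes.
    Second level (assumed form): if the two best classes are within
    `margin`, re-compare them on the top_k features whose trained
    means differ most between those two classes."""
    scored = sorted(
        (weighted_distance(x, class_means[c], class_weights[c]), c)
        for c in class_means
    )
    (d1, c1), (d2, c2) = scored[0], scored[1]
    if d2 - d1 >= margin:
        return c1
    m1, m2 = class_means[c1], class_means[c2]
    idx = sorted(range(len(x)), key=lambda i: abs(m1[i] - m2[i]),
                 reverse=True)[:top_k]
    s1 = weighted_distance(x, m1, class_weights[c1], idx)
    s2 = weighted_distance(x, m2, class_weights[c2], idx)
    return c1 if s1 <= s2 else c2
```

In this sketch the class means and weights stand in for the character-dependent data that the abstract says are obtained through training; weights could, for example, be set inversely to the per-feature variance observed in training samples, though the abstract does not say how they were chosen.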
