Handwritten numeral verification method using distribution maps of structural features

Abstract
Character recognition methods fall into two major approaches. One is pattern matching, which is little affected by topological changes such as breaks in strokes. The other is structural analysis, which tolerates distorted characters only if the topological features of their undistorted versions are preserved. We developed a new recognition method for handwritten numerals that combines the merits of the two approaches. The recognition process consists of three steps: (1) an input character is recognized by a pattern-matching method, which reduces the number of possible categories to 1.5 on average; (2) the character is verified as true, false, or uncertain by a newly developed structural analysis method; and (3) special heuristic verification logics are applied to uncertain characters. In the second step, the new structural analysis method uses the positions and directions of terminal points extracted from thinned character images as its main feature. The extracted terminal points are labeled according to a structural-feature distribution map prepared for each category. The generated labels are matched against template label sets constructed by statistical analysis. The characteristics of the method are as follows: (1) it copes with distortion of handwritten characters by using distribution maps for the positions and directions of feature points, and (2) the distribution maps can be generated automatically from statistical data in learning samples and easily tuned interactively. The merits of combining the two methods are as follows: (1) the advantages of both pattern matching and structural analysis are obtained; (2) the probabilities that steps 2 and 3 need to be executed are 22% and 9%, respectively, so these steps hardly affect the total processing time; and (3) as a result of steps 1 and 2, only a small number of special logics are required.
In a test using unconstrained handwritten characters of low quality, the recognition rate and substitution rate were 95.2% and 0.42%, respectively. A recognition speed of 80 characters/second was achieved on a small hardware system.
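
The verification step and the fall-through to heuristic logics described in the abstract might be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation. The grid cell size, the 8-way direction codes, the label names, the toy maps and templates, and the rule used to separate "false" from "uncertain" are all assumptions made for the example, and the names `TerminalPoint`, `label_points`, `verify`, and `recognize` are hypothetical. The abstract does not say when step 2 is triggered, so the sketch simply verifies every candidate passed in from step 1.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Optional, Sequence, Set, Tuple

CELL = 4       # assumed grid cell size on a normalized character image
UNKNOWN = "?"  # assumed label for a point outside every learned region


@dataclass(frozen=True)
class TerminalPoint:
    """A stroke end point taken from a thinned character image."""
    x: int
    y: int
    direction: int  # assumed 8-way quantized stroke direction, 0..7


# Assumed map layout: (cell_x, cell_y, direction) -> label, one map per category.
DistributionMap = Dict[Tuple[int, int, int], str]
TemplateSets = Set[FrozenSet[str]]


def label_points(points: Sequence[TerminalPoint],
                 dist_map: DistributionMap) -> FrozenSet[str]:
    """Label each terminal point by looking up its position and direction."""
    return frozenset(
        dist_map.get((p.x // CELL, p.y // CELL, p.direction), UNKNOWN)
        for p in points
    )


def verify(points: Sequence[TerminalPoint], dist_map: DistributionMap,
           templates: TemplateSets) -> str:
    """Return 'true', 'false', or 'uncertain' for one candidate category."""
    labels = label_points(points, dist_map)
    if labels in templates:
        return "true"
    # Assumed rule: an unlabeled point means the evidence is inconclusive.
    if UNKNOWN in labels:
        return "uncertain"
    return "false"


def recognize(points: Sequence[TerminalPoint], candidates: Sequence[str],
              maps: Dict[str, DistributionMap],
              templates: Dict[str, TemplateSets],
              heuristic: Callable[[str, Sequence[TerminalPoint]], bool]
              ) -> Optional[str]:
    """Verify step-1 candidates in order; use heuristics only if uncertain."""
    uncertain: List[str] = []
    for category in candidates:
        verdict = verify(points, maps[category], templates[category])
        if verdict == "true":
            return category
        if verdict == "uncertain":
            uncertain.append(category)
    for category in uncertain:
        if heuristic(category, points):
            return category
    return None  # reject the character


# Toy data, invented for illustration only.
MAPS: Dict[str, DistributionMap] = {
    "1": {(2, 0, 2): "top_end", (2, 3, 6): "bottom_end"},
    "7": {(0, 0, 4): "left_end", (2, 3, 6): "bottom_end"},
}
TEMPLATES: Dict[str, TemplateSets] = {
    "1": {frozenset({"top_end", "bottom_end"})},
    "7": {frozenset({"left_end", "bottom_end"})},
}

# A vertical stroke with one end near the top and one near the bottom.
stroke = [TerminalPoint(8, 1, 2), TerminalPoint(8, 14, 6)]
result = recognize(stroke, ["7", "1"], MAPS, TEMPLATES, lambda c, p: False)
print(result)
```

In this toy run the candidate "7" is only uncertain, because one terminal point falls outside its map, while "1" matches a template and is accepted before any heuristic is consulted. Representing the labels as a set ignores repeated labels; a real system would likely need a richer matching rule.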