Handwritten word recognition using continuous density variable duration hidden Markov model

Abstract
A complete system for the recognition of unconstrained handwritten words using a continuous density variable duration hidden Markov model (CDVDHMM) is described. First, a novel segmentation algorithm based on mathematical morphology is developed to translate the 2-D image into a 1-D sequence of subcharacter symbols. This sequence of symbols is modeled by the CDVDHMM. Thirty-five features are selected to represent the character symbols in the feature space. Written text generally carries two sources of information: shape information and linguistic knowledge. The shape information of each character symbol is modeled as a Gaussian mixture distribution, while the linguistic knowledge, i.e., the linguistic constraint, is modeled as a Markov chain. The variable duration state is used to handle the segmentation ambiguity between consecutive characters. Detailed experiments were carried out using handwritten city names, and successful recognition results are reported.
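The interplay between the Gaussian mixture shape model, the Markov chain over characters, and the variable duration state can be illustrated with a small decoding sketch. This is not the authors' implementation; it is a minimal variable duration Viterbi search under stated assumptions. The function names (`log_gmm`, `vd_viterbi`, `emit_logp`), the one-dimensional "width" feature standing in for the thirty-five features, the rule that merged segments are described by their summed width, and all numeric parameters are hypothetical and chosen only for illustration.

```python
import math

NEG_INF = float("-inf")


def log_gmm(x, weights, means, variances):
    """Log density of vector x under a diagonal-covariance Gaussian mixture."""
    terms = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            ll -= 0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        terms.append(ll)
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))  # log-sum-exp


def vd_viterbi(n_segments, states, log_init, log_trans, log_dur, emit_logp, max_dur):
    """Best (state, duration) sequence; a state may span 1..max_dur segments."""
    # delta[t][j]: best log score of a path whose last character j ends after t segments.
    delta = [{j: NEG_INF for j in states} for _ in range(n_segments + 1)]
    back = [{j: None for j in states} for _ in range(n_segments + 1)]
    for t in range(1, n_segments + 1):
        for j in states:
            for d in range(1, min(max_dur, t) + 1):
                # Duration term plus shape term for segments [t - d, t) merged into one character.
                local = log_dur[j].get(d, NEG_INF) + emit_logp(j, t - d, t)
                if local == NEG_INF:
                    continue
                if t == d:
                    prev, score = None, log_init[j] + local
                else:
                    prev, best = max(
                        ((i, delta[t - d][i] + log_trans[i][j]) for i in states),
                        key=lambda pair: pair[1],
                    )
                    score = best + local
                if score > delta[t][j]:
                    delta[t][j] = score
                    back[t][j] = (prev, d)
    last = max(states, key=lambda s: delta[n_segments][s])
    best_score = delta[n_segments][last]
    if best_score == NEG_INF:
        raise ValueError("no valid segmentation under the given model")
    path, t, j = [], n_segments, last
    while t > 0:  # follow back-pointers from the last character to the first
        prev, d = back[t][j]
        path.append((j, d))
        t, j = t - d, prev
    path.reverse()
    return path, best_score


# Toy input (hypothetical): widths of three segments produced by an over-segmenter.
widths = [1.5, 1.6, 1.0]
# Hypothetical per-character mixtures: (weights, means, variances).
gmms = {
    "m": ([0.5, 0.5], [[3.0], [3.2]], [[0.1], [0.1]]),
    "i": ([1.0], [[1.0]], [[0.1]]),
}


def emit_logp(state, start, end):
    """Shape log-likelihood of segments [start, end) merged into one character."""
    feature = [sum(widths[start:end])]  # assumption: merged width is the only feature
    return log_gmm(feature, *gmms[state])


states = ["m", "i"]
log_init = {s: math.log(0.5) for s in states}
log_trans = {a: {b: math.log(0.5) for b in states} for a in states}
log_dur = {
    "m": {1: math.log(0.2), 2: math.log(0.8)},
    "i": {1: math.log(0.9), 2: math.log(0.1)},
}
path, score = vd_viterbi(len(widths), states, log_init, log_trans, log_dur, emit_logp, max_dur=2)
print(path, score)
```

In this toy setup the decoder groups the first two segments into a single wide character and keeps the last segment as a narrow one, which is the kind of segmentation ambiguity the variable duration state is meant to resolve. A real system would replace the summed-width feature with the full feature vector and estimate all probabilities from training data.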