Formant estimation for speech recognition

Abstract
This paper presents a new method for estimating formant frequencies. The formant model is based on a digital resonator. Each resonator represents a segment of the short-time power spectrum. The complete spectrum is modeled by a set of digital resonators connected in parallel. An algorithm based on dynamic programming produces both the model parameters and the segment boundaries that optimally match the spectrum. We used this method in experimental tests that were carried out on the TI digit string data base. The main results of the experimental tests are: (1) the presented approach produces reliable estimates of formant frequencies across a wide range of sounds and speakers; and (2) the estimated formant frequencies were used in a number of variants for recognition. The best set-up resulted in a string error rate of 4.2% on the adult corpus of the TI digit string data base.

This publication has 19 references indexed in Scilit: