Non-linear spectral subtraction (NSS) and hidden Markov models for robust speech recognition in car noise environments

Abstract
The authors address the problem of speaker-dependent discrete utterance recognition in noise, with particular attention to the mismatch that arises when training and testing are carried out in different environments. They extend their previous work (Lockwood and Boudy, 1991), in which a robust hidden Markov model (HMM) training/recognition framework was proposed. Several new aspects are introduced: enhanced nonlinear spectral subtraction (NSS) schemes, root-MFCC parameters, dynamic features, and training of HMMs by a dynamic inference scheme (DIHMM). These enhancements are evaluated in tests performed on band-limited signals (200-3000 Hz). The authors show that the combined optimizations raise recognition performance from 20% to over 99%. A 93% recognition rate is already achievable on raw data using a weighted modified projection and a root-MFCC dynamic representation.
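As a rough illustration of two of the ingredients named in the abstract, the sketch below shows a generic magnitude-domain spectral subtraction with a spectral floor, and a root-compressed cepstrum in which a power law replaces the usual logarithm before the cosine transform. It is not the authors' NSS scheme or their exact root-MFCC front end: the function names, the oversubtraction factor `alpha`, the floor `beta`, the root exponent `gamma`, and the use of a plain DCT-II are all assumptions made for illustration.

```python
import numpy as np

def spectral_subtract(noisy_mag, noise_mag, alpha=2.0, beta=0.1):
    """Generic spectral subtraction (illustrative, not the paper's NSS).

    alpha: assumed oversubtraction factor; beta: assumed spectral floor.
    """
    cleaned = noisy_mag - alpha * noise_mag
    # Clamp to a fraction of the noise estimate to avoid negative magnitudes.
    return np.maximum(cleaned, beta * noise_mag)

def root_cepstrum(band_energies, gamma=0.5, n_coeffs=4):
    """Root-compressed cepstrum: energies**gamma replaces log, then DCT-II.

    gamma is an assumed exponent; the paper's value is not given here.
    """
    compressed = np.power(band_energies, gamma)
    n = len(compressed)
    k = np.arange(n_coeffs)[:, None]   # cepstral index
    m = np.arange(n)[None, :]          # filterbank channel index
    basis = np.cos(np.pi * k * (m + 0.5) / n)
    return basis @ compressed

if __name__ == "__main__":
    noisy = np.array([1.0, 0.5])
    noise = np.array([0.2, 0.4])
    print(spectral_subtract(noisy, noise))
    print(root_cepstrum(np.array([4.0, 4.0, 4.0, 4.0]), n_coeffs=2))
```

In a nonlinear variant, the subtraction factor would typically vary with a frequency-dependent signal-to-noise estimate rather than staying fixed as `alpha` does here.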
