Dynamic formant tracking of noisy speech using temporal analysis on outputs from a nonlinear cochlear model

1 May 1993

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Biomedical Engineering

Vol. 40 (5), 456-467
https://doi.org/10.1109/10.243416

Abstract

In this paper, we take a modeling approach to studying how the formant frequencies of speech, both clean and in noise, are represented in the temporal responses of the peripheral auditory system. Based on the properties of this representation, we have devised and evaluated a cross-channel correlation algorithm and an interpeak interval analysis for automatic formant extraction from speech that is strongly dynamic in its acoustic characteristics and is embedded in noise. The basilar membrane model used in this study contains laterally coupled damping elements, which are made monotonically dependent on the spatial distribution of the short-term power in the model outputs. An efficient digital implementation and the related salient numerical properties of the model are described. Simulation results from the model in response to speech and speech in noise illustrate temporal response patterns that are tonotopically organized in relation to speech formant parameters, with little influence from the noise level. By utilizing such relations, the devised cross-channel correlation algorithm is shown to be capable of accurately tracking formant movements in spoken syllables and sentences.
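The abstract names two temporal analyses, interpeak interval analysis and cross-channel correlation, without giving their details. The following is only a minimal sketch of the general ideas, not the paper's algorithm: it assumes each channel output is a plain list of samples, uses synthetic sinusoids in place of the nonlinear cochlear model, and all function names (`interpeak_frequency`, `normalized_correlation`, `adjacent_channel_correlation`, `tone`) are hypothetical.

```python
import math
from statistics import median


def interpeak_frequency(x, fs):
    """Estimate a channel's dominant frequency (Hz) from the spacing of its peaks."""
    # Positive local maxima: above the left neighbour, not below the right one.
    peaks = [
        i for i in range(1, len(x) - 1)
        if x[i] > 0.0 and x[i] > x[i - 1] and x[i] >= x[i + 1]
    ]
    if len(peaks) < 2:
        return float("nan")
    # Interpeak intervals in samples; the median resists occasional outliers.
    intervals = [b - a for a, b in zip(peaks, peaks[1:])]
    return fs / median(intervals)


def normalized_correlation(a, b):
    """Zero-lag correlation of two equal-length channels, scaled to [-1, 1]."""
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den if den > 0.0 else 0.0


def adjacent_channel_correlation(outputs):
    """Correlate each channel with its neighbour along the channel axis."""
    return [normalized_correlation(a, b) for a, b in zip(outputs, outputs[1:])]


# Assumed stand-in for model outputs: two channels locked to 500 Hz and one
# locked to 1500 Hz, sampled at 16 kHz (illustrative values only).
FS = 16000
N = 320


def tone(freq):
    return [math.sin(2.0 * math.pi * freq * n / FS) for n in range(N)]


channels = [tone(500.0), tone(500.0), tone(1500.0)]
estimates = [interpeak_frequency(c, FS) for c in channels]
correlations = adjacent_channel_correlation(channels)
print(estimates)
print(correlations)
```

In this toy setup, neighbouring channels dominated by the same component correlate strongly, while the correlation drops at the boundary between channels locked to different components. The interpeak estimate is quantized by the sampling period, so higher frequencies are resolved more coarsely unless the peak positions are interpolated.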

This publication has 20 references indexed in Scilit:

Speech analysis and recognition using interval statistics generated from a composite auditory model
IEEE Transactions on Speech and Audio Processing, 1998
Time-domain analysis of auditory-nerve-fiber firing rates
The Journal of the Acoustical Society of America, 1990
The representation of the spectra and fundamental frequencies of steady-state single- and double-vowel sounds in the temporal discharge patterns of guinea pig cochlear-nerve fibers
The Journal of the Acoustical Society of America, 1990
Responses of "high-spontaneous" auditory-nerve fibers to consonant–vowel syllables in noise
The Journal of the Acoustical Society of America, 1989
Responses of auditory-nerve fibers to multiple-tone complexes
The Journal of the Acoustical Society of America, 1987
Frequency selectivity of single cochlear‐nerve fibers based on the temporal response pattern to two‐tone signals
The Journal of the Acoustical Society of America, 1986
Coding of spectral fine structure in the auditory nerve. I. Fourier analysis of period and interspike interval histograms
The Journal of the Acoustical Society of America, 1986
Speech processing in the auditory system II: Lateral inhibition and the central processing of speech evoked activity in the auditory nerve
The Journal of the Acoustical Society of America, 1985
Speech coding in the auditory nerve: V. Vowels in background noise
The Journal of the Acoustical Society of America, 1984
Speech coding in the auditory nerve: IV. Sounds with consonant-like dynamic characteristics
The Journal of the Acoustical Society of America, 1984

Cited by 12 articles