Dynamic formant tracking of noisy speech using temporal analysis on outputs from a nonlinear cochlear model
- 1 May 1993
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Biomedical Engineering
- Vol. 40 (5), 456-467
- https://doi.org/10.1109/10.243416
Abstract
In this paper we take a modeling approach to studying representation of formant frequencies of spoken speech and speech in noise in the temporal responses of the peripheral auditory system. On the basis of the properties of the representation, we have devised and evaluated a cross-channel correlation algorithm and an interpeak interval analysis for automatic formant extraction of speech which is strongly dynamic in acoustic characteristics and is embedded in noise. The basilar membrane model used in this study contains laterally coupled damping elements, which are made monotonically dependent on the spatial distribution of the short-term power in the outputs of the model. Efficient digital implementation and the related salient numerical properties of the model are described. Simulation results from the model in response to speech and speech in noise illustrate temporal response patterns that are tonotopically organized in relation to speech formant parameters with little influence by the noise level. By utilizing such relations the devised cross-channel correlation algorithm is shown to be capable of accurately tracking formant movements in spoken syllables and sentences.
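The abstract names two temporal analyses, interpeak interval analysis and cross-channel correlation, without giving their details. The following is only a minimal illustrative sketch of the general ideas and not the authors' algorithm. It assumes that each channel is a plain list of samples, that peaks are positive local maxima, and that correlation is taken at zero lag between neighbouring channels. The function names are hypothetical, and pure sine tones stand in for the outputs of the cochlear model.

```python
import math
from statistics import median


def interpeak_frequency(x, fs):
    """Estimate a channel's dominant frequency from the spacing of its positive peaks.

    Assumption: a peak is a positive sample larger than its left neighbour
    and not smaller than its right neighbour.
    """
    peaks = [n for n in range(1, len(x) - 1)
             if x[n] > 0 and x[n] > x[n - 1] and x[n] >= x[n + 1]]
    if len(peaks) < 2:
        return float("nan")
    intervals = [b - a for a, b in zip(peaks, peaks[1:])]
    # The median interval (in samples) is robust to a few spurious peaks.
    return fs / median(intervals)


def normalized_correlation(a, b):
    """Zero-lag correlation coefficient of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0:
        return 0.0
    return sum(p * q for p, q in zip(da, db)) / denom


def adjacent_channel_correlation(channels):
    """Correlate each channel with its neighbour; values near 1 suggest
    that both channels are synchronized to the same signal component."""
    return [normalized_correlation(a, b)
            for a, b in zip(channels, channels[1:])]


# Stand-in data (assumption): three "channels" sampled at 16 kHz for 0.1 s.
fs = 16000
t = [n / fs for n in range(1600)]
tone_500 = [math.sin(2 * math.pi * 500 * s) for s in t]
tone_2000 = [math.sin(2 * math.pi * 2000 * s) for s in t]

print(interpeak_frequency(tone_500, fs))
print(adjacent_channel_correlation([tone_500, tone_500, tone_2000]))
```

In this toy setting, neighbouring channels that carry the same tone correlate strongly, while channels dominated by different tones do not. A formant tracker built on this idea would look for groups of adjacent channels locked to a common frequency; how the paper does this is not specified in the abstract.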