Voiced-unvoiced-silence detection using the Itakura LPC distance measure
- 24 March 2005
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
One of the most difficult problems in speech analysis is reliable discrimination among silence, unvoiced speech, and voiced speech which has been transmitted over a telephone line. Although several methods have been proposed for making this 3-level decision, these schemes have met with only modest success. In this paper a novel approach to the voiced-unvoiced-silence detection problem is proposed in which a spectral characterization of each of the 3 classes of signal is obtained during a training session, and an LPC distance metric and an energy distance are nonlinearly combined to make the final discrimination. This algorithm has been tested over conventional switched telephone lines, across a variety of speakers, and has been found to have an error rate of about 5%, with the majority of the errors (about 2/3) occurring at the boundaries between signal classes. The algorithm is currently being used in a speaker-independent word recognition system.
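To illustrate the LPC distance named in the abstract, the following Python sketch computes the Itakura log likelihood ratio between a test frame and a reference LPC vector, such as one obtained for a signal class during training. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the predictor order of 8, and the use of the autocorrelation method are illustrative choices. The energy distance and the nonlinear combination rule are not specified in the abstract, so they are not shown.

```python
import numpy as np

def autocorrelation(frame, order):
    """Autocorrelation lags r[0..order] of one frame (assumed already windowed)."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    return np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])

def toeplitz_matrix(r):
    """Symmetric Toeplitz matrix with R[i, j] = r[|i - j|]."""
    idx = np.arange(len(r))
    return r[np.abs(idx[:, None] - idx[None, :])]

def lpc_coefficients(r):
    """Inverse-filter vector a = [1, -alpha_1, ..., -alpha_p] from the normal equations."""
    order = len(r) - 1
    alpha = np.linalg.solve(toeplitz_matrix(r[:order]), r[1:])
    return np.concatenate(([1.0], -alpha))

def itakura_distance(frame, a_ref, order=8):
    """log((a_ref R a_ref^T) / (a_frame R a_frame^T)), with R taken from the test frame.

    a_ref must have order + 1 entries. The order of 8 is an assumption for illustration.
    """
    r = autocorrelation(frame, order)
    R = toeplitz_matrix(r)
    a_frame = lpc_coefficients(r)  # minimizes the quadratic form, so the ratio is >= 1
    return float(np.log((a_ref @ R @ a_ref) / (a_frame @ R @ a_frame)))

# Hypothetical usage with synthetic frames standing in for speech.
rng = np.random.default_rng(0)
test_frame = rng.standard_normal(400)
reference_frame = np.convolve(rng.standard_normal(400), [1.0, 0.9])
a_reference = lpc_coefficients(autocorrelation(reference_frame, 8))
distance = itakura_distance(test_frame, a_reference)
```

Because the frame's own LPC vector minimizes the prediction residual, the distance is non-negative and is zero only when the reference vector matches the frame's own LPC solution.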
This publication has 12 references indexed in Scilit:
- A speaker-independent, syntax-directed, connected word recognition system based on hidden Markov models and level building. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1985
- Evaluation of a Statistical Approach to Voiced-Unvoiced-Silence Analysis for Telephone-Quality Speech. Bell System Technical Journal, 1977
- Speech encryption by manipulations of LPC parameters. The Journal of the Acoustical Society of America, 1976
- A pattern recognition approach to voiced-unvoiced-silence classification with applications to speech recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1976
- Real-time digital hardware pitch detector. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1976
- Applications of a nonlinear smoothing algorithm to speech processing. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1975
- Minimum prediction residual principle applied to speech recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1975
- The SIFT algorithm for fundamental frequency estimation. IEEE Transactions on Audio and Electroacoustics, 1972
- System for Automatic Formant Analysis of Voiced Speech. The Journal of the Acoustical Society of America, 1970
- Cepstrum Pitch Determination. The Journal of the Acoustical Society of America, 1967