Measuring and modeling vocal source-tract interaction

1 July 1994

journal article
research article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Biomedical Engineering

Vol. 41 (7), 663-671
https://doi.org/10.1109/10.301733

Abstract

The quality of synthetic speech is affected by two factors: intelligibility and naturalness. At present, synthesized speech may be highly intelligible, but often sounds unnatural. Speech intelligibility depends on the synthesizer's ability to reproduce the formants, the formant bandwidths, and formant transitions, whereas speech naturalness is thought to depend on the excitation waveform characteristics for voiced and unvoiced sounds. Voiced sounds may be generated by a quasiperiodic train of glottal pulses of specified shape exciting the vocal tract filter. It is generally assumed that the glottal source and the vocal tract filter are linearly separable and do not interact. However, this assumption is often not valid, since it has been observed that appreciable source-tract interaction can occur in natural speech. Previous experiments in speech synthesis have demonstrated that the naturalness of synthetic speech does improve when source-tract interaction is simulated in the synthesis process. The purpose of this paper is two-fold: 1) to present an algorithm for automatically measuring source-tract interaction for voiced speech, and 2) to present a simple speech production model that incorporates source-tract interaction into the glottal source model. This glottal source model controls: 1) the skewness of the glottal pulse, and 2) the amount of the first formant ripple superimposed on the glottal pulse. A major application of the results of this paper is the modeling of vocal disorders.
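As a rough illustration of the two controls named in the abstract (pulse skewness and first-formant ripple), the sketch below builds one period of a glottal flow pulse. It is not the model from the paper: the Rosenberg-style raised-cosine shape, the multiplicative ripple term, and all names (`glottal_pulse`, `open_quotient`, `skew`, `ripple`, `f1_hz`, `f0_hz`) and default values are assumptions chosen only for demonstration.

```python
import math


def glottal_pulse(n_samples, open_quotient=0.6, skew=0.7, ripple=0.1,
                  f1_hz=500.0, f0_hz=100.0):
    """Return one period of an illustrative glottal flow pulse (hypothetical model).

    skew   : fraction of the open phase spent rising (0 < skew < 1);
             larger values move the peak later, skewing the pulse to the right.
    ripple : relative amplitude of a sinusoid at f1_hz superimposed on the pulse.
    """
    if not 0.0 < skew < 1.0:
        raise ValueError("skew must lie strictly between 0 and 1")
    if not 0.0 < open_quotient <= 1.0:
        raise ValueError("open_quotient must lie in (0, 1]")

    period = 1.0 / f0_hz
    t_open = open_quotient * period   # duration of the open phase
    t_rise = skew * t_open            # opening (rising) part
    t_fall = t_open - t_rise          # closing (falling) part

    pulse = []
    for n in range(n_samples):
        t = n * period / n_samples
        if t < t_rise:
            # Raised-cosine rise from 0 to 1
            base = 0.5 * (1.0 - math.cos(math.pi * t / t_rise))
        elif t < t_open:
            # Quarter-cosine fall from 1 to 0
            base = math.cos(0.5 * math.pi * (t - t_rise) / t_fall)
        else:
            # Closed phase: no flow
            base = 0.0
        # Assumed ripple form: scaled by the base pulse so the closed phase stays at zero
        pulse.append(base * (1.0 + ripple * math.sin(2.0 * math.pi * f1_hz * t)))
    return pulse
```

Calling `glottal_pulse(200, skew=0.3, ripple=0.0)` and `glottal_pulse(200, skew=0.7, ripple=0.0)` gives pulses whose peaks fall early and late in the open phase respectively, while a nonzero `ripple` adds an oscillation at `f1_hz` on top of the smooth pulse.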

Cited by 51 articles