A pitch-synchronous analysis of hoarseness in running speech

Abstract
A method of pitch-synchronous acoustic analysis of hoarseness requiring a voice sample of only four fundamental periods is presented. This method calculates a noise-to-signal (N/S) ratio, which indicates the depth of valleys between harmonic peaks in the power spectrum. The spectrum is calculated pitch synchronously from a Fourier transform of the signal, windowed through a continuously variable Hanning window spanning exactly four fundamental periods. A two-stage procedure is used to determine the exact duration of the four fundamental periods. An initial estimate is obtained using autocorrelation in the time domain. A more precise estimate is obtained in the frequency domain by minimizing the errors between the preliminary calculated power spectrum and the predicted spectrum spread of a windowed harmonic signal. Analysis of synthesized voices showed that the N/S ratio is sensitive to additive noise, jitter, and shimmer, and is insensitive to slow (8 Hz) modulation in fundamental frequency and amplitude. An analysis of pre- and postoperative voices of six patients with benign laryngeal disease showed that the N/S ratio for the vowel /u/ in running speech consistently improved after surgery for all subjects, in agreement with their successful therapeutic results.
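
The processing chain described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions. The fundamental period is taken to be an integer number of samples, whereas the paper uses a continuously variable window. The second-stage frequency-domain refinement is omitted. The N/S ratio is approximated as the power at the bins midway between harmonics divided by the power at the harmonic bins. The function names (`synth_voice`, `estimate_period`, `hann_power_bin`, `noise_to_signal_db`) and the five-harmonic test signal are hypothetical. With a Hann window spanning exactly four periods, harmonic h falls on DFT bin 4h and leaks only into bins 4h-1 and 4h+1, so bin 4h+2 sits in the valley and carries only non-harmonic energy.

```python
import cmath
import math
import random


def synth_voice(period, n_samples, noise_sd=0.0, seed=0):
    """Hypothetical test signal: five harmonics (amplitude 1/h) plus Gaussian noise."""
    rng = random.Random(seed)
    return [
        sum(math.sin(2 * math.pi * h * i / period) / h for h in range(1, 6))
        + rng.gauss(0.0, noise_sd)
        for i in range(n_samples)
    ]


def estimate_period(x, min_lag, max_lag):
    """Coarse period estimate: lag of the autocorrelation peak in [min_lag, max_lag]."""
    def autocorr(lag):
        return sum(x[i] * x[i + lag] for i in range(len(x) - lag))
    return max(range(min_lag, max_lag + 1), key=autocorr)


def hann_power_bin(x, n, k):
    """Power at DFT bin k of the first n samples, weighted by a periodic Hann window."""
    acc = 0j
    for i in range(n):
        w = 0.5 - 0.5 * math.cos(2 * math.pi * i / n)
        acc += w * x[i] * cmath.exp(-2j * math.pi * k * i / n)
    return abs(acc) ** 2


def noise_to_signal_db(x, period, n_harmonics=10):
    """Assumed N/S definition: valley-bin power over harmonic-bin power, in dB."""
    n = 4 * period  # window spans four fundamental periods
    if len(x) < n:
        raise ValueError("need at least four fundamental periods of signal")
    signal = sum(hann_power_bin(x, n, 4 * h) for h in range(1, n_harmonics + 1))
    noise = sum(hann_power_bin(x, n, 4 * h + 2) for h in range(1, n_harmonics + 1))
    # Floor the noise term so a perfectly periodic input does not hit log10(0).
    return 10 * math.log10(max(noise, 1e-300) / signal)
```

Calling `noise_to_signal_db(synth_voice(100, 600), 100)` on the noise-free signal should give a ratio far below that of `synth_voice(100, 600, noise_sd=0.05)`, which mirrors the sensitivity to additive noise reported in the abstract. Handling non-integer periods would require resampling or a window whose length is not tied to the sample grid, which this sketch does not attempt.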
