Some Technical Considerations in Voice Perturbation Measurements

Abstract
The maximum theoretical quantization noise (shimmer and jitter) introduced by digital recording systems is estimated and compared with normal vocal shimmer and jitter. Without interpolation between samples, nine bits of resolution and 500 samples per cycle are needed to minimize the contaminating sampling noise. With interpolation, however, fewer than 100 samples per cycle can resolve jitter down to 0.1%. Low-pass filtering is not harmful unless peak-picking strategies are used and the low-pass filter severely broadens the peaks. A window of at least 20 cycles is suggested, and multiple tokens of an utterance are necessary to obtain a stable mean for perturbation measures.
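
The effect of interpolation on sampling-induced jitter can be illustrated with a minimal sketch. It is not the procedure used in the paper. It assumes a perfectly periodic synthetic sine tone (so any measured jitter is purely a sampling artifact), positive-going zero crossings as the cycle marker, linear interpolation between the two samples that bracket each crossing, and jitter defined as the mean absolute difference between consecutive periods divided by the mean period. The function names and the values 97.3 samples per cycle, 40 cycles, and the 0.3 rad starting phase are illustrative choices only.

```python
import math


def synthetic_tone(samples_per_cycle, n_cycles, phase=0.3):
    """Perfectly periodic sine wave; its true jitter is exactly zero."""
    n = int(samples_per_cycle * n_cycles)
    step = 2.0 * math.pi / samples_per_cycle
    return [math.sin(step * k + phase) for k in range(n)]


def zero_crossings(samples, interpolate):
    """Times of positive-going zero crossings, in units of samples."""
    times = []
    for i in range(len(samples) - 1):
        a, b = samples[i], samples[i + 1]
        if a < 0.0 <= b:
            if interpolate:
                # Linear interpolation between the bracketing samples.
                times.append(i + (-a) / (b - a))
            else:
                # No interpolation: snap to the first non-negative sample.
                times.append(float(i + 1))
    return times


def jitter_percent(times):
    """Mean absolute consecutive-period difference over mean period, in %."""
    periods = [t2 - t1 for t1, t2 in zip(times, times[1:])]
    diffs = [abs(p2 - p1) for p1, p2 in zip(periods, periods[1:])]
    mean_period = sum(periods) / len(periods)
    return 100.0 * (sum(diffs) / len(diffs)) / mean_period


# Illustrative values: a non-integer number of samples per cycle, below 100.
samples_per_cycle = 97.3
tone = synthetic_tone(samples_per_cycle, n_cycles=40)

raw = jitter_percent(zero_crossings(tone, interpolate=False))
interp = jitter_percent(zero_crossings(tone, interpolate=True))

print(f"apparent jitter without interpolation: {raw:.4f}%")
print(f"apparent jitter with interpolation:    {interp:.6f}%")
```

Because the tone has no true jitter, both printed values are measurement noise from sampling alone. Without interpolation each period estimate is forced to a whole number of samples, so the apparent jitter is on the order of one sample divided by the period length. Real voice signals are not sinusoidal, so the size of the improvement from interpolation in this sketch should not be read as a prediction for recorded speech.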