Text-independent speaker identification

Abstract
We describe current approaches to text-independent speaker identification based on probabilistic modeling techniques. These probabilistic approaches have largely supplanted methods based on comparisons of long-term feature averages. They divide fundamentally into nonparametric and parametric probability models. Nonparametric models have the advantage of being potentially more accurate (though possibly more fragile), while parametric models offer computational efficiencies and the ability to characterize the effects of the environment through their effects on the parameters. A robust speaker-identification system is presented that can deal with various forms of anomalies that are localized in time, such as spurious noise events and crosstalk. It is based on a segmental approach in which normalized segment scores form the basic input for a variety of robust procedures. Experimental results are presented, illustrating the advantages and disadvantages of the different procedures. We show the role that cross-validation can play in determining how to weight the different sources of information when combining them into a single score. Finally, we explore a Bayesian approach to measuring confidence in the decisions made, which enabled us to reject certain tests from consideration in order to achieve an improved predicted performance level on the tests that were retained.
