NOTE: The following materials are presented for timely
dissemination of academic and technical work. Copyright and all other rights
therein are reserved by authors and/or other copyright holders. Personal
use of the following materials is permitted; however, people using
the materials or information are expected to adhere to the terms and
constraints invoked by the related copyright.
On the Use of Different Representations for Speaker Modeling
ABSTRACT
Numerous speech representations have been reported to be useful in speaker
recognition. However, there is much less agreement on which speech
representation provides a perfect representation of
speaker-specific information conveyed in a speech signal.
Unlike previous work, we propose an
alternative approach to speaker modeling by the simultaneous use of
different speech representations in an optimal way. Inspired by our previous
empirical studies, we present a soft competition scheme on different
speech representations in order to exploit these representations
in encoding speaker-specific information. On the basis of this soft
competition scheme, we present a parametric statistical model, the
generalized Gaussian mixture model, to characterize a speaker identity
based on different speech representations. Moreover, we develop an
expectation-maximization (EM) algorithm for parameter estimation
in the generalized Gaussian mixture model. The proposed speaker modeling
approach has been applied to text-independent speaker recognition, and
comparative results on the KING speech corpus demonstrate its effectiveness.
Click tsmcc2005.pdf for full text