NOTE: The following materials are presented for timely
dissemination of academic and technical work. Copyright and all other rights
therein are reserved by authors and/or other copyright holders. Personal
use of the following materials is permitted; however, people using
the materials or information are expected to adhere to the terms and
constraints invoked by the related copyright.
A Modified HME Architecture for Text-Dependent Speaker Identification
ABSTRACT
A modified hierarchical mixtures of experts (HME) architecture is presented
for text-dependent speaker identification. A new gating network is introduced
to the original HME architecture for the use of instantaneous and transitional
spectral information in text-dependent speaker identification. The statistical
model underlying the proposed architecture is presented and learning is
treated as a maximum likelihood problem; in particular, an
Expectation-Maximization (EM) algorithm is proposed for adjusting the
parameters of the proposed architecture. An evaluation has been carried out
using a database of isolated digit utterances by 10 male talkers. Experimental
results demonstrate that the proposed architecture outperforms the original
HME architecture in text-dependent speaker identification.
Full text: tnn96.pdf
Correction to Figure 1: tnn96c.pdf