NOTE: The following materials are presented for timely dissemination of academic and technical work. Copyright and all other rights therein are reserved by the authors and/or other copyright holders. Personal use of the following materials is permitted; however, people using the materials or information are expected to adhere to the terms and constraints invoked by the related copyright.

Capture Inter-Speaker Information with a Neural Network for Speaker Identification


The model-based approach is one of the methods widely used for speaker identification: a statistical model characterizes a specific speaker's voice, but no inter-speaker information is involved in its parameter estimation. Inter-speaker information has nevertheless been observed to be very helpful in discriminating between different speakers. In this paper, we propose a novel method that uses inter-speaker information to improve the performance of a model-based speaker identification system. A neural network is employed to capture the inter-speaker information from the output space of those statistical models. To make full use of inter-speaker information, a rival penalized encoding rule is proposed for designing the supervised learning pairs. Moreover, for better generalization, a query-based learning algorithm is presented that actively selects the input data of interest during training of the neural network. Comparative results on the KING speech corpus show that our method leads to a considerable improvement for a model-based speaker identification system.
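The general idea of learning from the output space of per-speaker models can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: diagonal Gaussians stand in for the statistical speaker models, a single-layer softmax classifier stands in for the neural network, the data are synthetic, and all function names are hypothetical. The rival penalized encoding rule and the query-based learning algorithm are not reproduced, since the abstract does not give their details.

```python
import numpy as np

def fit_diag_gaussian(frames):
    """Fit one diagonal Gaussian to a speaker's frames (stand-in speaker model)."""
    return frames.mean(axis=0), frames.var(axis=0) + 1e-6

def avg_log_likelihood(frames, model):
    """Average per-frame log-likelihood of a segment under one speaker model."""
    mean, var = model
    ll = -0.5 * (np.log(2 * np.pi * var) + (frames - mean) ** 2 / var).sum(axis=1)
    return ll.mean()

def score_vector(frames, models):
    """Scores of one segment under ALL speaker models: the 'output space'."""
    return np.array([avg_log_likelihood(frames, m) for m in models])

def train_softmax(X, y, n_classes, lr=0.1, epochs=500):
    """Plain gradient descent on cross-entropy (assumed stand-in for the network)."""
    W, b = np.zeros((X.shape[1], n_classes)), np.zeros(n_classes)
    Y = np.eye(n_classes)[y]  # ordinary one-hot targets, not the paper's encoding rule
    for _ in range(epochs):
        logits = X @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
        grad = (P - Y) / len(X)
        W -= lr * X.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

def demo(seed=0):
    """Train on synthetic 'speakers' and return held-out identification accuracy."""
    rng = np.random.default_rng(seed)
    true_means = np.array([[0.0] * 4, [3.0] * 4, [-3.0] * 4])  # synthetic, assumed
    n_speakers = len(true_means)

    def sample(k, n):
        return rng.normal(true_means[k], 1.0, size=(n, 4))

    # Each model is estimated from its own speaker's data only.
    models = [fit_diag_gaussian(sample(k, 500)) for k in range(n_speakers)]

    def make_set(n_segments):
        X, y = [], []
        for k in range(n_speakers):
            for _ in range(n_segments):
                X.append(score_vector(sample(k, 50), models))
                y.append(k)
        return np.array(X), np.array(y)

    X_train, y_train = make_set(20)
    X_test, y_test = make_set(10)
    mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
    W, b = train_softmax((X_train - mu) / sigma, y_train, n_speakers)
    X_test_norm = (X_test - mu) / sigma
    pred = np.argmax(X_test_norm @ W + b, axis=1)
    return float((pred == y_test).mean())

if __name__ == "__main__":
    print("held-out accuracy:", demo())
```

In this sketch the classifier sees the scores of every speaker model at once, so it can exploit how a segment scores under rival models rather than relying only on the best-scoring model.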

Full text: tnn2002.pdf