Towards a Multilingual Approach on Speaker Classification

Christian Müller, Michael Feld

In: Proceedings of the 11th International Conference "Speech and Computer" (SPECOM 2006), St. Petersburg, pages 120-124. Anatolya Publishers, 2006.


This paper outlines a framework for a multilingual speaker classification system built on an underlying language identification module. First, it introduces the Agender speaker classification technology, a two-layered approach that primarily recognizes a speaker's age and gender but also incorporates novel domain-independent aspects that can be applied to other speaker characteristics such as emotion or cognitive load. A major drawback of Agender is that its chosen set of speech features has not been verified for other languages, especially those with different phonological properties. To overcome this drawback, the paper proposes extending Agender with a language identification module. The module presented here is designed to meet the requirements of a specific telephone-based application (which itself is outside the scope of this paper): German, English and Turkish are to be discriminated on the basis of the speaker's initial utterance; for each possible language, hypotheses about the nature of the initial utterance are available; and the domain encompasses a list of English product names. Although the proposed method is so far only partly implemented, the first evaluation results are very promising: Turkish was identified with an accuracy of 71.75%, German with 78.39%, and English with 79.89%. In addition, the paper outlines how the language identification module can be used within a multilingual version of Agender.



Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence