Combining Regression and Classification Methods for Improving Automatic Speaker Age Recognition

Charl van Heerden, Etienne Barnard, Marelie Davel, Christiaan van der Walt, Ewald van Dyk, Michael Feld, Christian Müller

In: Proceedings of the 35th International Conference on Acoustics, Speech, and Signal Processing (ICASSP-2010), March 14-19, Dallas, TX, United States. IEEE Computer Society, 3/2010.


We present a novel approach to automatic speaker age classification, which combines regression and classification to achieve competitive classification accuracy on telephone speech. Support vector machine regression is used to generate finer age estimates, which are combined with the posterior probabilities of well-trained discriminative gender classifiers to predict both the age and gender of a speaker. We show that this combination performs better than direct 7-class classifiers. The regressors and classifiers are trained using long-term features such as pitch and formants, as well as short-term (frame-based) features derived from MAP adaptation of GMMs that were trained on MFCCs.
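As a rough illustration of the combination idea, the sketch below maps a continuous age estimate and a gender posterior to one of seven joint classes. It is not the authors' method: the abstract does not give the combination rule or the class definitions, so the age-band boundaries, the unsplit child class, the 0.5 decision threshold, and the names `age_band` and `combine` are all assumptions made for illustration. The age estimate and posterior are taken as given inputs; no regressor or classifier is trained here.

```python
# Hypothetical age-band boundaries in years; the abstract does not state them.
CHILD_MAX = 13
YOUNG_MAX = 19
ADULT_MAX = 64


def age_band(age_estimate):
    """Map a continuous age estimate (e.g. from a regressor) to a coarse band."""
    if age_estimate <= CHILD_MAX:
        return "child"
    if age_estimate <= YOUNG_MAX:
        return "young"
    if age_estimate <= ADULT_MAX:
        return "adult"
    return "senior"


def combine(age_estimate, p_male):
    """Combine a regressed age with a gender posterior into one joint class.

    Assumption: children are not split by gender, so there are
    1 + 3 * 2 = 7 possible classes.
    """
    if not 0.0 <= p_male <= 1.0:
        raise ValueError("p_male must be a probability in [0, 1]")
    band = age_band(age_estimate)
    if band == "child":
        return "child"
    # Hard decision on the gender posterior (assumed threshold of 0.5).
    gender = "male" if p_male >= 0.5 else "female"
    return f"{band}_{gender}"
```

For example, `combine(42.5, 0.8)` returns `"adult_male"` under these assumed boundaries. A hard threshold is the simplest possible combination; a system could instead weight the age estimate by the posterior, which this sketch does not attempt.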


German Research Center for Artificial Intelligence
Deutsches Forschungszentrum für Künstliche Intelligenz