Classification of listener linguistic vocalisations in interactive meetings

Marcela Charfuelan Oliva, Marc Schröder, Sathish Chandra Pammi

In: Proceedings of the 19th European Signal Processing Conference (EUSIPCO 2011), August 29 - September 2, 2011, Barcelona, Spain.


This paper presents the classification of two types of listener linguistic vocalisations that occur during spontaneous interactions in the AMI-IDIAP meeting corpus. In a first stage, principal component analysis (PCA) of low-level acoustic measures is used to separate salient lower and higher acoustic events. We have found that two types of linguistic vocalisations appear very often in salient events: among the lower salient acoustic events, 44% correspond to backchannel vocalisations, whereas among the higher salient events, 32% correspond to stall vocalisations. In a second stage, once salient acoustic events are split into high and low, two Support Vector Machine (SVM) classifiers are trained with different acoustic features to classify these two sets separately. We obtained classification accuracies of 81% and 80% for stall and backchannel linguistic vocalisations, respectively. The approach can be applied to the development of SAL (sensitive artificial listener) systems, or interactive systems in general.
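The two-stage pipeline described in the abstract can be sketched as follows. This is a minimal illustration using scikit-learn and synthetic data in place of the AMI-IDIAP acoustic measures; the split criterion (sign of the first principal component), the feature dimensionality, and the labels are all assumptions for illustration, not the paper's exact setup.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Stage 1: PCA over low-level acoustic measures; events are split into
# "low" and "high" salient groups by the first principal component
# (an assumed thresholding, not the paper's precise criterion).
X = rng.normal(size=(400, 10))          # 400 events x 10 acoustic measures
pc1 = PCA(n_components=1).fit_transform(X).ravel()
low_idx, high_idx = pc1 < 0, pc1 >= 0

# Stage 2: one SVM per group, each trained to detect its vocalisation
# type (backchannel in the low group, stall in the high group).
# Labels here are synthetic placeholders.
y = rng.integers(0, 2, size=400)
for name, idx in [("low/backchannel", low_idx), ("high/stall", high_idx)]:
    Xtr, Xte, ytr, yte = train_test_split(X[idx], y[idx], random_state=0)
    clf = SVC(kernel="rbf").fit(Xtr, ytr)
    acc = accuracy_score(yte, clf.predict(Xte))
    print(name, "accuracy:", acc)
```

With real acoustic features and annotated labels, the per-group accuracies reported in the paper (81% and 80%) would be computed at the final step.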



Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence