In: Proceedings of the Interspeech 2008. Conference in the Annual Series of Interspeech Events (INTERSPEECH-2008), September 22-26, Brisbane, Australia, 2008.
Abstract
We present two approaches to acoustic event detection for speech-enabled car applications: a generative GMM-UBM approach and a discriminative GMM-SVM supervector approach. The systems detect whether a certain acoustic event occurred while the car's built-in microphone was active to record a spoken command, either before, during, or after the driver's utterance. Such events include music playing, a phone ringing, or a passenger other than the driver talking, laughing, or coughing. The task is formally defined as a detection task along the lines of well-established detection tasks such as speaker recognition or language recognition. Similarly, the evaluation procedure was designed to resemble the respective official evaluation series conducted by NIST, i.e., a blind one-shot evaluation on a separately provided dataset. System performance was measured in terms of detection miss and false alarm probabilities (CMiss = CFA = 1, PTarget = 0.5). The superior GMM-SVM system achieved 0.0345 for known test speakers and 0.1955 for novel test speakers. Frequency-filtered band energy coefficients (FFBE) outperformed MFCCs on this task. The results are promising and suggest further experiments on more data.
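To make the evaluation metric and the generative approach concrete, the following minimal Python sketch computes a NIST-style detection cost from miss and false-alarm probabilities using the parameters stated above (CMiss = CFA = 1, PTarget = 0.5), and sketches generic GMM-UBM log-likelihood-ratio scoring. The function names, feature dimensions, and example values are illustrative assumptions, not taken from the paper.

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def detection_cost(p_miss, p_fa, c_miss=1.0, c_fa=1.0, p_target=0.5):
        # Weighted detection cost from miss and false-alarm probabilities,
        # with the cost parameters given in the abstract.
        return c_miss * p_target * p_miss + c_fa * (1.0 - p_target) * p_fa

    def gmm_ubm_score(features, event_gmm, ubm):
        # Average log-likelihood ratio of an event-specific GMM against a
        # universal background model (UBM); a generic sketch of GMM-UBM
        # scoring, not the paper's exact configuration.
        return np.mean(event_gmm.score_samples(features) - ubm.score_samples(features))

    # Hypothetical error rates, for illustration only:
    print(detection_cost(p_miss=0.05, p_fa=0.02))  # 0.5*0.05 + 0.5*0.02 = 0.035

    # Toy GMM-UBM scoring on random 12-dimensional "features":
    rng = np.random.default_rng(0)
    ubm = GaussianMixture(n_components=4, random_state=0).fit(rng.standard_normal((500, 12)))
    event_gmm = GaussianMixture(n_components=4, random_state=0).fit(rng.standard_normal((200, 12)) + 0.5)
    print(gmm_ubm_score(rng.standard_normal((50, 12)) + 0.5, event_gmm, ubm))

A test segment is accepted as containing the target event when its score exceeds a threshold; sweeping that threshold yields the miss and false-alarm probabilities that enter the cost above.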
@inproceedings{pub4411,
  author    = {Müller, Christian and Biel, Joan-Isaac and Kim, Edward and Rosario, Daniel},
  title     = {Speech-overlapped Acoustic Event Detection for Automotive Applications},
  booktitle = {Proceedings of the Interspeech 2008. Conference in the Annual Series of Interspeech Events (INTERSPEECH-2008), September 22-26, Brisbane, Australia},
  year      = {2008}
}