Hybrid Multi-Step Disfluency Detection

Sebastian Germesin, Tilman Becker, Peter Poller

In: Andrei Popescu-Belis, Rainer Stiefelhagen (eds.). Machine Learning for Multimodal Interaction. Machine Learning and Multimodal Interaction (MLMI-08), 5th International Workshop, September 8-10, Utrecht, Netherlands. Pages 185-195, Lecture Notes in Computer Science (LNCS) 5237, ISBN 978-3-540-85852-2, Springer, Heidelberg, 2008.


Previous research has shown that speech disfluencies (speech errors that occur in spoken language) affect NLP systems and hence need to be repaired or at least marked. This study presents a hybrid approach that uses different detection techniques for this task, each specialized for its own disfluency domain. A thorough investigation of the disfluency scheme in use led us to a detection design in which basic rule-matching techniques are combined with machine learning and N-gram-based approaches. The aim was both to reduce computational overhead and processing time and to increase detection performance. In fact, we were able to increase the amount of clean speech material from 85.6% to 92.2% while keeping the detection time below real time.
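The multi-step idea described in the abstract, where cheap rule-matching steps run first and more expensive detectors handle what remains, might be sketched roughly as follows. This is only an illustration under stated assumptions and not the authors' implementation: the filler inventory, the label names, and the functions `mark_fillers`, `mark_repetitions`, `detect`, and `clean_share` are all hypothetical. The machine-learning and N-gram stages are not reproduced here; they would be plugged in as further step functions with the same signature.

```python
from typing import Callable, List

# Hypothetical filler inventory; the paper's actual disfluency scheme is not given here.
FILLERS = {"uh", "um", "er", "hmm"}

# A step inspects the tokens and updates the labels in place.
Step = Callable[[List[str], List[str]], None]


def mark_fillers(tokens: List[str], labels: List[str]) -> None:
    """Rule step: mark tokens found in the filler inventory."""
    for i, tok in enumerate(tokens):
        if labels[i] == "clean" and tok.lower() in FILLERS:
            labels[i] = "filler"


def mark_repetitions(tokens: List[str], labels: List[str]) -> None:
    """Rule step: mark the first copy of an immediately repeated word.

    Tokens already marked by an earlier step are skipped, so
    "I uh I" is still recognized as a repetition of "I".
    """
    prev = None  # index of the previous token that is still labeled clean
    for i, tok in enumerate(tokens):
        if labels[i] != "clean":
            continue
        if prev is not None and tokens[prev].lower() == tok.lower():
            labels[prev] = "repetition"
        prev = i


def detect(tokens: List[str], steps: List[Step]) -> List[str]:
    """Run the steps in order; each one only touches tokens still labeled clean."""
    labels = ["clean"] * len(tokens)
    for step in steps:
        step(tokens, labels)
    return labels


def clean_share(labels: List[str]) -> float:
    """Fraction of tokens not marked as disfluent."""
    return labels.count("clean") / len(labels) if labels else 1.0


tokens = "I uh I want want to go".split()
labels = detect(tokens, [mark_fillers, mark_repetitions])
cleaned = [t for t, lab in zip(tokens, labels) if lab == "clean"]
```

In this sketch a learned classifier or an N-gram model would simply be another entry in the list passed to `detect`, placed after the rule steps so that it only has to consider tokens the cheaper steps left unmarked.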

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence