PHD

A PDF version of my thesis is here.

Abstract

This thesis describes the results of several years of research on a small part of natural language processing: a robust approach to dialogue modeling and discourse processing, with language independence as an overall goal. The main context in which the approach has been developed is VerbMobil, a project for a speech-to-speech translation system for spontaneously spoken negotiation dialogue.

The first major contribution of the thesis is a dialogue model based on propositional content and intentions. Two distinct parts of the model are discussed in depth: (a) An approach to the modeling and tracking of propositional content that is based on description logics and default unification. This technique has been developed further within the SmartKom project---a project for symmetric multimodal dialogue. This is true in particular for the advancement and formalization of \emph{overlay}, a default unification algorithm in combination with a scoring function that together serve as the main operation for the contextual interpretation of user hypotheses. (b) An approach to the modeling and tracking of intentions that is based on dialogue acts, dialogue moves and dialogue games together with dialogue phases. These building blocks can be arranged in such a way that the complete dialogue is described on five different levels in a language-independent way. A new characterization called \emph{dialogue moves} is introduced that encompasses several dialogue acts.
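To make the idea behind \emph{overlay} more concrete, the following is a minimal sketch, not the thesis's algorithm. It assumes plain nested dictionaries instead of typed feature structures over a description logic, so type clashes and type generalization are left out entirely. The function name `overlay`, the counters, and the `score` formula are hypothetical stand-ins chosen for illustration. The sketch only shows the default-unification intuition: new information (the cover) wins on conflict, and everything else is inherited from the context (the background).

```python
def overlay(cover, background):
    """Lay `cover` (new hypothesis) over `background` (context).

    Returns (result, stats). Values from `cover` win on conflict;
    everything missing from `cover` is inherited from `background`.
    Simplification: untyped nested dicts, no type hierarchy.
    """
    result = {}
    stats = {"cover": 0, "background": 0, "conflicts": 0}
    for key in sorted(set(cover) | set(background)):
        in_cover, in_background = key in cover, key in background
        if in_cover and in_background:
            c, b = cover[key], background[key]
            if isinstance(c, dict) and isinstance(b, dict):
                # Both sides are structured: recurse and add up the counts.
                sub_result, sub_stats = overlay(c, b)
                result[key] = sub_result
                for name in stats:
                    stats[name] += sub_stats[name]
            else:
                # The cover value is kept; a differing background value
                # is overwritten and counted as a conflict.
                result[key] = c
                stats["cover"] += 1
                if c != b:
                    stats["conflicts"] += 1
        elif in_cover:
            # Present only in the cover (a whole subtree counts once).
            result[key] = cover[key]
            stats["cover"] += 1
        else:
            # Present only in the background: inherited by default.
            result[key] = background[key]
            stats["background"] += 1
    return result, stats


def score(stats):
    """Hypothetical score in [-1, 1]: fewer conflicts give a higher value."""
    total = stats["cover"] + stats["background"] + stats["conflicts"]
    if total == 0:
        return 0.0
    return (stats["cover"] + stats["background"] - stats["conflicts"]) / total


# Context: a meeting in Hannover on Monday in May.
background = {
    "event": "meeting",
    "date": {"day": "monday", "month": "may"},
    "place": "hannover",
}
# New user hypothesis: "rather on Tuesday".
cover = {"date": {"day": "tuesday"}}

result, stats = overlay(cover, background)
```

Here `result` keeps the event, the month and the place from the context while the day is replaced by the new value. A score of this kind could be used to rank several competing hypotheses against the same context; the actual scoring function used in the thesis may differ.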

In the second major contribution of the thesis, an application based on the content of the discourse memory is described: multilingual generation of summaries. For negotiative dialogue, the discourse memory contains, at the end of the dialogue and amongst other discourse objects, those objects that both interlocutors have agreed upon. On the basis of these, a data-driven bottom-up generation algorithm is described that, together with two already existing modules of the VerbMobil system (the transfer module and the multilingual generator GECO), produces summaries in any language deployed by the system.

The third major contribution of the thesis is the evaluation of the summarization functionality of VerbMobil. A comprehensive evaluation shows the validity of our approach to dialogue modeling. Additionally, it is shown that standard evaluation metrics based on precision and recall fail to describe the performance of the summarization functionality correctly. A crucial finding is that erroneous processing in any processing step inserts discourse objects that were not part of the dialogue into the discourse memory. These objects, called \emph{confabulations}, eventually appear in the summaries. The standard evaluation metrics precision and recall are based on the assumption that the system selects, correctly or erroneously, a subset of those discourse objects that are actually mentioned by the interlocutors. If these metrics are used, the number of confabulative errors committed by the system is never revealed. Therefore, a more extensive evaluation distinguishing confabulations from true positive discourse objects is described. Through the use of two new metrics, \emph{relative} and \emph{total} confabulation, a more honest characterization of the system performance is presented.
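The argument about precision and recall can be illustrated with a small sketch. It assumes that discourse objects can be represented as plain strings in sets, and that precision and recall are computed only over summary objects that were actually mentioned in the dialogue (the subset assumption described above). The function `evaluate` and the `confabulation_share` value are hypothetical; they are not the thesis's definitions of relative and total confabulation, which are not given in this abstract.

```python
def evaluate(mentioned, agreed, summary):
    """Compare subset-based precision/recall with a confabulation count.

    mentioned: objects that actually occurred in the dialogue
    agreed:    the mentioned objects both interlocutors agreed upon
    summary:   objects the system put into the summary
    """
    # Under the subset assumption only mentioned objects are evaluated.
    selected = summary & mentioned
    hits = selected & agreed
    # Objects in the summary that never occurred in the dialogue.
    confabulated = summary - mentioned

    precision = len(hits) / len(selected) if selected else 0.0
    recall = len(hits) / len(agreed) if agreed else 0.0
    # Hypothetical measure: share of the summary that is confabulated.
    confabulation_share = len(confabulated) / len(summary) if summary else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "confabulated": confabulated,
        "confabulation_share": confabulation_share,
    }


mentioned = {"date:monday", "date:tuesday", "place:hannover", "time:10am"}
agreed = {"date:tuesday", "place:hannover"}
# The system summary contains a place that nobody mentioned.
summary = {"date:tuesday", "place:hannover", "place:hamburg"}

report = evaluate(mentioned, agreed, summary)
```

In this constructed example, precision and recall both come out as perfect although one third of the summary was never said by either interlocutor. Only the separate confabulation count reveals the error.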

In the conclusion, we summarize the thesis and suggest further research directions. Finally, the appendix shows excerpts from the annotation manuals for dialogue acts, moves and games together with some corpus characteristics. It also shows traces of two sample dialogues processed by VerbMobil, along with the corresponding summaries.

Jan Alexandersson