DFKI-LT - Dissertation Series


Núria Bertomeu Castelló: A Memory and Attention-Based Approach to Fragment Resolution and its Application in a Question Answering System

ISBN: 3-933218-25-4
302 pages
price: € 18



Fragments are syntactically incomplete utterances that convey a full message, although parts of this message are not explicitly expressed and must be recovered from the context. This thesis is concerned with the recovery of this implicitly conveyed message. On the basis of corpus studies representative of both unrestricted language use (spontaneous spoken interactions) and restricted language use (written question answering interactions), fragments are classified according to their context dependency into two main classes: a) fragments resolved against a linguistically represented antecedent, and b) fragments resolved against a conceptual representation of a highly focused referent, either introduced in the discourse or present in the communicative context. Resolving the second type of fragment often requires reasoning to establish the connection between the focused referent and the content of the fragment.

Taking into consideration results from cognitive studies regarding discourse memory, a discourse model is proposed in which information at the linguistic level rapidly decays, while information at the conceptual level remains longer in memory. The antecedents of fragments are either semantic structural representations of previous utterances still in memory or semantic conceptual representations in the focus of attention. A formalization of this model within the HPSG framework is presented. For this purpose, an interface has been designed in the HPSG declarative grammar formalism, which allows reference to dynamic processes without compromising the declarativeness of the representation.
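The two-level memory idea can be illustrated with a minimal sketch. It is not the thesis's HPSG formalization or implementation: the class name, the decay rates, the threshold and the example utterances are all hypothetical, chosen only to show linguistic entries fading faster than conceptual ones.

```python
from dataclasses import dataclass, field

# Hypothetical parameters: the abstract only states that linguistic
# information decays rapidly while conceptual information lasts longer.
LINGUISTIC_DECAY = 0.5
CONCEPTUAL_DECAY = 0.9
THRESHOLD = 0.2  # entries below this activation are forgotten


@dataclass
class DiscourseMemory:
    linguistic: dict = field(default_factory=dict)  # utterance -> activation
    conceptual: dict = field(default_factory=dict)  # referent -> activation

    def add_utterance(self, utterance, referents):
        """Decay existing entries, then store the new utterance and its referents."""
        self.linguistic = self._decayed(self.linguistic, LINGUISTIC_DECAY)
        self.conceptual = self._decayed(self.conceptual, CONCEPTUAL_DECAY)
        self.linguistic[utterance] = 1.0
        for referent in referents:
            self.conceptual[referent] = 1.0

    @staticmethod
    def _decayed(store, rate):
        # Scale every activation, then drop entries below the threshold.
        scaled = {key: value * rate for key, value in store.items()}
        return {key: value for key, value in scaled.items() if value >= THRESHOLD}


memory = DiscourseMemory()
memory.add_utterance("Which projects does the lab run?", ["lab"])
memory.add_utterance("And who leads them?", [])
memory.add_utterance("Since when?", [])
memory.add_utterance("Thanks.", [])

# The first utterance is no longer available as a linguistic antecedent,
# but its referent is still available at the conceptual level.
first_utterance_remembered = "Which projects does the lab run?" in memory.linguistic
lab_still_in_focus = "lab" in memory.conceptual
```

Under these assumed rates, the first utterance drops out of linguistic memory after three further turns, while the referent it introduced remains a candidate antecedent at the conceptual level.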

To obtain information about the use of fragments in question answering (QA) interactions, a Wizard-of-Oz simulation of a QA system was conducted, and the resulting corpus was investigated with regard to the use of implicit reference devices. This corpus has served as a point of reference for the implementation of the theoretical model as a discourse module with fragment resolution capabilities in a QA application.

In the implemented discourse module, fragment resolution at the linguistic level considers information from several sources, such as named-entity recognition, syntax and knowledge. For resolution at the conceptual level, two alternatives have been developed: a connectionist resolution component and a hybrid one. The conceptual representation of the content of the discourse has been implemented as a symbolic neural network, where the focus of attention is determined by "spreading activation". In the hybrid approach, resolution is achieved by querying an external ontology to check whether the association between some referent in the focus of attention and the content of the fragment is well-formed according to the domain model. In the connectionist approach, the network is enriched with knowledge, and resolving a fragment involves finding the most active association in the network between some referent and the content of the fragment.
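As a rough illustration of the connectionist idea, the following sketch spreads activation over a small weighted concept graph and then picks the referent whose association with the fragment's content is most active. It is an assumption-laden toy, not the thesis's network: the function names, the graph, the weights, the decay factor and the scoring rule (referent activation times link weight) are all hypothetical.

```python
from collections import defaultdict


def spread_activation(edges, sources, decay=0.5, steps=1):
    """Propagate activation from source nodes along weighted, undirected edges."""
    neighbours = defaultdict(list)
    for (a, b), weight in edges.items():
        neighbours[a].append((b, weight))
        neighbours[b].append((a, weight))

    activation = defaultdict(float)
    for node, value in sources.items():
        activation[node] += value

    frontier = dict(sources)
    for _ in range(steps):
        next_frontier = defaultdict(float)
        for node, value in frontier.items():
            for neighbour, weight in neighbours[node]:
                # Each hop passes on a damped share of the activation.
                next_frontier[neighbour] += value * weight * decay
        for node, value in next_frontier.items():
            activation[node] += value
        frontier = next_frontier
    return dict(activation)


def resolve_fragment(edges, activation, fragment_concept, candidates):
    """Return the candidate referent with the most active link to the fragment."""
    def weight(a, b):
        return edges.get((a, b), edges.get((b, a), 0.0))

    return max(
        candidates,
        key=lambda referent: activation.get(referent, 0.0)
        * weight(referent, fragment_concept),
    )


# Hypothetical domain knowledge encoded as association strengths.
edges = {
    ("project", "budget"): 0.8,
    ("project", "researcher"): 0.6,
    ("researcher", "email"): 0.9,
    ("project", "email"): 0.1,
}

# "project" was just mentioned, so it is the source of activation.
activation = spread_activation(edges, {"project": 1.0})

budget_referent = resolve_fragment(edges, activation, "budget", ["project", "researcher"])
email_referent = resolve_fragment(edges, activation, "email", ["project", "researcher"])
```

In this toy graph, a fragment about a budget attaches to the highly active "project", whereas a fragment about an email attaches to "researcher": the referent is less active, but its association with the fragment's content is much stronger. A hybrid variant would replace the weight lookup with a query to an external ontology.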

This work concludes with a quantitative evaluation of the implemented solution and an analysis of its coverage. The discourse module improves the total coverage of the original QA system by 8.6%.