Publication

A Cross-Lingual German-English Framework for Open-Domain Question Answering

Bogdan Sacaleanu, Günter Neumann

In: C. Peters, P. Clough, F.C. Gey, J. Karlgren, B. Magnini, D.W. Oard, M. de Rijke, M. Stempfhuber (eds.). Evaluation of Multilingual and Multi-modal Information Retrieval. Pages 328-338, LNCS 4730, Springer, Berlin, Heidelberg, 2007.

Abstract

The paper describes QUANTICO, a cross-language open-domain question answering system for German and English. The main features of the system are: preemptive off-line annotation of documents with syntactic information such as chunk structures, apposition constructions and abbreviation-extension pairs for passage retrieval; online translation services, language models and alignment methods for the cross-language scenarios; redundancy as an indicator of good answer candidates; and selection of the best answers based on distance metrics defined over graph representations. Depending on the question type, one of two answer extraction strategies is triggered: for factoid questions, answers are extracted from the best IR-matched passages and selected by their redundancy and distance to the question keywords; for definition questions, answers are taken to be the most redundant normalized linguistic structures with an explanatory role (i.e., appositions and abbreviation extensions). The CLEF evaluation of the system's performance gave the following results: for the best German-German run we achieved an overall accuracy (ACC) of 42.33% and a mean reciprocal rank (MRR) of 0.45; for the best English-German run, 32.98% (ACC) and 0.35 (MRR); for the German-English run, 17.89% (ACC) and 0.17 (MRR).
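As a reading aid for the ACC and MRR figures above, the following minimal sketch shows how the two measures are commonly computed from per-question ranks. It assumes the standard textbook definitions (accuracy as the share of questions whose top-ranked answer is correct, MRR as the mean of 1/rank of the first correct answer, with 0 for questions that have no correct answer). The exact CLEF scoring rules may differ, and the function names and sample ranks are hypothetical, not taken from the paper.

```python
from typing import Optional, Sequence


def accuracy(ranks: Sequence[Optional[int]]) -> float:
    """Share of questions whose first-ranked answer is correct."""
    if not ranks:
        raise ValueError("at least one question is required")
    return sum(1 for r in ranks if r == 1) / len(ranks)


def mean_reciprocal_rank(ranks: Sequence[Optional[int]]) -> float:
    """Mean of 1/rank of the first correct answer; None counts as 0."""
    if not ranks:
        raise ValueError("at least one question is required")
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)


# Hypothetical data: rank of the first correct answer for four questions,
# None meaning that no correct answer was returned.
sample_ranks = [1, 2, None, 1]
acc = accuracy(sample_ranks)
mrr = mean_reciprocal_rank(sample_ranks)
print(f"ACC = {acc:.2%}, MRR = {mrr:.3f}")
```

Under these definitions MRR can never be lower than accuracy, since every question answered correctly at rank 1 contributes a full 1 to both sums.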

CLEF2006.pdf (PDF, 242 KB)

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence