Extracting and Querying Relations in Scientific Papers

Ulrich Schäfer, Hans Uszkoreit, Christian Federmann, Torsten Marek, Yajing Zhang

In: Andreas Dengel, K. Berns, Thomas Breuel, Frank Bomarius, Thomas Roth-Berghofer (eds.). Proceedings of the 31st Annual German Conference on Artificial Intelligence (KI-2008), September 23-26, Kaiserslautern, Germany, pages 127-134. Lecture Notes in Artificial Intelligence (LNAI) 5243, ISBN 9783540858447, Springer, Heidelberg, 2008.


High-precision linguistic and semantic analysis of scientific texts is an emerging research area. We describe methods and an application for extracting interesting factual relations from scientific texts in computational linguistics and language technology. We use a hybrid NLP architecture: shallow preprocessing for increased robustness and domain-specific, ontology-based named entity recognition, followed by a deep HPSG parser running the English Resource Grammar (ERG). The extracted relations, represented in the Minimal Recursion Semantics (MRS) format, are simplified and generalized using WordNet. The resulting 'quriples' are stored in a database, from which they can be retrieved by relation-based search. The query interface is embedded in a web browser-based application we call the Scientist's Workbench. It supports researchers in editing scientific papers and searching them online.
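
The storage and relation-based retrieval step described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the four-field `Quriple` layout, the table schema, the `build_store` and `search` helpers, and the small hand-written `PREDICATE_CLASSES` map, which merely stands in for the WordNet-based generalization. The example records are invented.

```python
import sqlite3
from typing import NamedTuple, Optional


class Quriple(NamedTuple):
    """Hypothetical record layout for one extracted relation."""
    subject: str
    predicate: str
    obj: str
    rest: str  # remaining complements or adjuncts, possibly empty


# Hand-written stand-in for WordNet-based generalization (assumption):
# related predicates are mapped to one shared representative.
PREDICATE_CLASSES = {
    "outperform": "surpass",
    "exceed": "surpass",
    "surpass": "surpass",
}


def generalize(predicate: str) -> str:
    """Map a predicate to its representative; unknown predicates are lowercased."""
    key = predicate.lower()
    return PREDICATE_CLASSES.get(key, key)


def build_store(quriples: list[Quriple]) -> sqlite3.Connection:
    """Create an in-memory database holding the generalized relations."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE quriples (subject TEXT, predicate TEXT, obj TEXT, rest TEXT)"
    )
    conn.executemany(
        "INSERT INTO quriples VALUES (?, ?, ?, ?)",
        [(q.subject, generalize(q.predicate), q.obj, q.rest) for q in quriples],
    )
    conn.commit()
    return conn


def search(
    conn: sqlite3.Connection, predicate: str, subject: Optional[str] = None
) -> list[Quriple]:
    """Retrieve relations by (generalized) predicate, optionally filtered by subject."""
    sql = "SELECT subject, predicate, obj, rest FROM quriples WHERE predicate = ?"
    params = [generalize(predicate)]
    if subject is not None:
        sql += " AND subject LIKE ?"
        params.append(f"%{subject}%")
    sql += " ORDER BY rowid"  # keep insertion order for reproducible results
    return [Quriple(*row) for row in conn.execute(sql, params)]


if __name__ == "__main__":
    # Invented example relations, not taken from the paper's data.
    store = build_store([
        Quriple("the parser", "outperform", "the baseline", "on newswire text"),
        Quriple("our method", "exceed", "previous results", ""),
        Quriple("we", "use", "a hybrid architecture", ""),
    ])
    for hit in search(store, "surpass"):
        print(hit)
```

Because predicates are generalized both when storing and when querying, a search for one verb also finds relations expressed with verbs mapped to the same representative, which is the practical point of normalizing before storage.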


German Research Center for Artificial Intelligence