Extracting and Querying Relations in Scientific Papers on Language Technology

Ulrich Schäfer, Hans Uszkoreit, Christian Federmann, Torsten Marek, Yajing Zhang

In: Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC-08), Marrakesh, Morocco, pages 3040-3046. ELRA, May 2008.


We describe methods for extracting interesting factual relations from scientific texts in computational linguistics and language technology taken from the ACL Anthology. We use a hybrid NLP architecture with shallow preprocessing for increased robustness and domain-specific, ontology-based named entity recognition, followed by a deep HPSG parser running the English Resource Grammar (ERG). The extracted relations, represented in the Minimal Recursion Semantics (MRS) format, are simplified and generalized using WordNet. The resulting 'quriples' are stored in a database, from which they can be retrieved (again using abstraction methods) by relation-based search. The query interface is embedded in a web browser-based application we call the Scientist's Workbench. It supports researchers in editing scientific papers and searching them online.
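The store-and-retrieve step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the table layout, the function names (`generalize`, `create_store`, `add_relation`, `search`), and the reduction of a quriple to subject, predicate, and object are assumptions made for illustration, and the small hand-written `SYNONYMS` table is only a stand-in for WordNet-based generalization. Predicates are assumed to be lemmatized already.

```python
import sqlite3

# Hypothetical stand-in for WordNet-based generalization: each predicate
# lemma maps to one canonical representative of its synonym group.
SYNONYMS = {
    "employ": "use",
    "utilize": "use",
    "apply": "use",
    "beat": "outperform",
}


def generalize(predicate):
    """Return the canonical form of a predicate lemma (assumed lemmatized)."""
    lemma = predicate.lower()
    return SYNONYMS.get(lemma, lemma)


def create_store():
    """Create an in-memory relation store (table layout is an assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE relation ("
        "subject TEXT, predicate TEXT, canonical_predicate TEXT, "
        "object TEXT, source TEXT)"
    )
    return conn


def add_relation(conn, subject, predicate, obj, source):
    """Store one extracted relation together with its generalized predicate."""
    conn.execute(
        "INSERT INTO relation VALUES (?, ?, ?, ?, ?)",
        (subject, predicate, generalize(predicate), obj, source),
    )


def search(conn, predicate, subject=None, obj=None):
    """Relation-based search: match on the generalized predicate and,
    optionally, on substrings of the subject and object."""
    sql = (
        "SELECT subject, predicate, object, source FROM relation "
        "WHERE canonical_predicate = ?"
    )
    params = [generalize(predicate)]
    if subject is not None:
        sql += " AND subject LIKE ?"
        params.append(f"%{subject}%")
    if obj is not None:
        sql += " AND object LIKE ?"
        params.append(f"%{obj}%")
    sql += " ORDER BY rowid"
    return conn.execute(sql, params).fetchall()
```

Because the query predicate is generalized in the same way as the stored one, a search for "utilize" in this sketch also returns relations that were extracted with "use" or "employ", which mirrors the idea of applying abstraction on both the storage and the retrieval side.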

Further Links

hylap-aiama-lrec08.pdf (PDF, 571 KB)

German Research Center for Artificial Intelligence