Publication

Integration of the Thesaurus for the Social Sciences (TheSoz) in an Information Extraction System

Thierry Declerck

In: Piroska Lendvai, Kalliopi Zervanou (editors). Proceedings of the 7th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH-13), located at ACL 13, August 8, Sofia, Bulgaria. Pages 90-95, Association for Computational Linguistics, 8/2013.

Abstract

We present current work dealing with the integration of a multilingual thesaurus for social sciences in an NLP framework for supporting Knowledge-Driven Information Extraction in the field of social sciences. We describe the various steps that lead to a running IE system: lexicalization of the labels of the thesaurus and semi-automatic generation of domain-specific IE grammars, with their subsequent implementation in a finite-state engine. Finally, we outline the actual field of application of the IE system: analysis of social media for recognition of relevant topics in the context of elections.

Projects

LaTECH_l2013_TheSoZ_Final2.pdf (pdf, 48 KB)

German Research Center for Artificial Intelligence
Deutsches Forschungszentrum für Künstliche Intelligenz