Ontologies as a Source for the Automatic Generation of Grammars for Information Extraction Systems

Thierry Declerck; Paul Buitelaar

In: Diana Maynard; Marieke van Erp; Brian Davis (eds.). Proceedings of the SWAIE 2012 Workshop: Semantic Web and Information Extraction. Workshop on Semantic Web and Information Extraction (SWAIE-12), located at the 18th International Conference on Knowledge Engineering and Knowledge Management, October 9, Galway, Ireland, CEURS, 10/2012.


Grammars for Natural Language Processing (NLP) applications are generally built either by linguists (on the basis of their language competence) or by automated tools applied to existing large corpora of language data, using supervised or unsupervised methods (or a combination of both). Domain knowledge has usually played only a minor role in this process. The increasing availability of extended knowledge representation systems, such as taxonomies and ontologies, opens the way to new approaches to the (automated) generation of processing grammars, especially in the field of domain-oriented Information Extraction (IE). The reason is that most taxonomies and ontologies carry natural language expressions in ontology elements such as labels, comments or definitions. These de facto established relations between (domain) knowledge and natural language expressions can be exploited for the automatic generation of domain-specific NLP and IE grammars. In this paper we describe steps leading to this automation.
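
The idea of exploiting labels attached to ontology elements can be illustrated with a minimal sketch. It is not the method of the paper: it assumes a small hypothetical ontology fragment (the class identifiers and labels in `ONTOLOGY_LABELS` are invented for illustration) and uses plain regular expressions in place of a real IE grammar formalism. The helper names `build_grammar` and `extract` are likewise hypothetical.

```python
import re

# Hypothetical ontology fragment (an assumption, not taken from the paper):
# class identifiers mapped to natural language labels, as they might
# appear in rdfs:label values.
ONTOLOGY_LABELS = {
    "ex:OperatingIncome": ["operating income", "operating profit"],
    "ex:NetSales": ["net sales"],
}


def build_grammar(labels_by_class):
    """Turn the labels of each class into one case-insensitive regex rule."""
    grammar = {}
    for cls, labels in labels_by_class.items():
        # Longest labels first, so longer expressions are tried before shorter ones.
        ordered = sorted(labels, key=len, reverse=True)
        # Allow flexible whitespace between the tokens of a label.
        alternatives = [
            r"\s+".join(re.escape(token) for token in label.split())
            for label in ordered
        ]
        grammar[cls] = re.compile(
            r"\b(?:" + "|".join(alternatives) + r")\b", re.IGNORECASE
        )
    return grammar


def extract(text, grammar):
    """Return (class, matched text) pairs, ordered by position in the text."""
    hits = []
    for cls, pattern in grammar.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), cls, match.group(0)))
    return [(cls, matched) for _, cls, matched in sorted(hits)]
```

Calling `extract(text, build_grammar(ONTOLOGY_LABELS))` on a sentence annotates each matched expression with the ontology class whose label produced the rule, so the extracted text is linked back to the domain knowledge it came from. A realistic system would additionally analyse the labels linguistically (for example lemmatisation and phrase structure) rather than match surface strings.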


Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence