A Multi-Layer Approach to the Extraction of Schema Components of Ontologies from Text

Mihaela Vela, Thierry Declerck

In: Proceedings SEMAPRO 2011. International Conference on Advances in Semantic Processing (SEMAPRO-2011), November 20-25, Lisbon, Portugal. XPS (Xpert Publishing Services), 11/2011.


We describe current work on deriving ontological structures from textual analysis. To this end, we define an incremental set of rules that are applied to the intermediate results of a multi-layered processing of textual documents. The first step applies basic linguistic heuristics, formulated as textual patterns, to plain text in order to identify relevant segments from which candidate ontology classes and relations can be extracted with the help of heuristic rules. The second step consolidates those candidates by analysing the text containing and surrounding the segments, which has been annotated with part-of-speech, morphological and lexical semantic information. The last step refines the extracted ontology structures on the basis of the constituency and dependency analysis of the textual segments whose elements yielded consolidated candidates for ontology classes and relations. We show how these three steps offer different opportunities for ontology extraction.
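
As an illustration of the first step only, the following minimal Python sketch shows how a single textual pattern applied to plain text could yield candidate classes and a candidate relation. The "X such as A, B and C" pattern, the relation label subClassOf, and the names SEP, PATTERN and extract_candidates are assumptions made for this example; they are not taken from the paper, whose actual patterns and heuristic rules are not given in the abstract.

```python
import re

# Hypothetical separator between enumerated items: ", ", " and ", ", or ", ...
SEP = r"(?:,\s*(?:and|or)\s+|,\s*|\s+(?:and|or)\s+)"

# Hypothetical textual pattern: "<general term> such as <item>, <item> and <item>"
PATTERN = re.compile(r"(\w+) such as (\w+(?:" + SEP + r"\w+)*)")


def extract_candidates(text):
    """Return candidate (subclass, relation, superclass) triples found in text.

    The triples are unverified candidates; in a multi-layer setting they
    would still need consolidation and refinement by later steps.
    """
    triples = []
    for match in PATTERN.finditer(text):
        superclass = match.group(1)
        # Split the enumeration into its single-word items.
        for item in re.split(SEP, match.group(2)):
            triples.append((item, "subClassOf", superclass))
    return triples


if __name__ == "__main__":
    sentence = "The firm sells instruments such as bonds, shares and options."
    for triple in extract_candidates(sentence):
        print(triple)
```

The sketch only handles single-word terms. Multi-word terms and the disambiguation of candidates would require the annotations used by the second and third steps (part-of-speech, morphology, lexical semantics, constituency and dependency analysis).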


German Research Center for Artificial Intelligence
Deutsches Forschungszentrum für Künstliche Intelligenz