
Publication

Extraction, Merging, and Monitoring of Company Data from Heterogeneous Sources

Christian Federmann; Thierry Declerck
In: Daniel Tapias; Mike Rosner; Stelios Piperidis; Jan Odijk; Joseph Mariani; Bente Maegaard; Khalid Choukri; Nicoletta Calzolari (Conference Chair) (eds.). Proceedings of the Seventh conference on International Language Resources and Evaluation. International Conference on Language Resources and Evaluation (LREC-10), Seventh, May 19-21, Valletta, Malta, ISBN 2-9517408-6-7, European Language Resources Association (ELRA), 5/2010.

Abstract

We describe the implementation of an enterprise monitoring system that builds on an ontology-based information extraction (OBIE) component applied to heterogeneous data sources. The OBIE component consists of several IE modules, each of which extracts a specific fraction of company data from a given data source at regular intervals, and a merging tool that aggregates all the extracted information about a company. The full set of company information to be extracted and merged by the OBIE component is specified in the schema of a domain ontology, which guides the information extraction process. When the monitoring system detects that the extracted and merged information on a company differs from the current state of the ontology's knowledge base, it updates the population of the ontology. Because the ontology is extended with temporal information, the system can assign time intervals to any of the object instances. Additionally, detected changes can be communicated to end users, who can validate and, if necessary, correct the resulting updates in the knowledge base.
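
The extract, merge, and monitor cycle summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Fact`, `merge`, and `monitor`, the dictionary-based knowledge base, the "first source wins" merging rule, and the sample company data are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Fact:
    """One attribute value of a company, valid over a time interval (hypothetical model)."""
    value: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the fact is still valid


def merge(extractions):
    """Aggregate per-source records into one record.

    Assumed rule: the first source that provides an attribute wins;
    later sources only fill in attributes that are still missing.
    """
    merged = {}
    for record in extractions:
        for attribute, value in record.items():
            merged.setdefault(attribute, value)
    return merged


def monitor(knowledge_base, merged, today):
    """Compare merged data with the knowledge base and apply updates.

    Returns a list of (attribute, old_value, new_value) tuples that
    could be shown to an end user for validation.
    """
    changes = []
    for attribute, new_value in merged.items():
        history = knowledge_base.setdefault(attribute, [])
        # The current fact is the last one whose interval is still open.
        current = history[-1] if history and history[-1].valid_to is None else None
        if current is not None and current.value == new_value:
            continue  # nothing changed for this attribute
        if current is not None:
            current.valid_to = today  # close the interval of the outdated fact
        history.append(Fact(new_value, today))
        changes.append((attribute, current.value if current else None, new_value))
    return changes


# Invented sample data: one known fact and two extracted records.
kb = {"ceo": [Fact("A. Example", date(2009, 1, 1))]}
records = [{"ceo": "B. Example"}, {"ceo": "C. Other", "employees": "1200"}]
changes = monitor(kb, merge(records), date(2010, 5, 19))
```

In this sketch the outdated fact is kept with a closed validity interval rather than overwritten, which mirrors the idea of an ontology extended with temporal information; how the actual system represents intervals is not stated in the abstract.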