Bootstrapping an Ontology-based Information Extraction System.

Alexander Maedche; Günter Neumann; Steffen Staab

In: Piotr S. Szczepaniak; Javier Segovia; Janusz Kacprzyk; Lotfi A. Zadeh (Eds.). Intelligent Exploration of the Web. Pages 345-359, Studies in Fuzziness and Soft Computing, Vol. 111, ISBN 978-3-7908-1529-0, Springer/Physica-Verlag GmbH, Heidelberg, 2003.


Automatic intelligent web exploration will benefit from shallow information extraction techniques if they can be made to work across many different domains. The major bottleneck, however, is the modeling of lexical knowledge, extraction rules, and an ontology that together define the information extraction system, which has so far been difficult and expensive. In this paper we present a bootstrapping approach that allows for the fast creation of an ontology-based information extraction system relying on several basic components, viz. a core information extraction system, an ontology engineering environment, and an inference engine. We make extensive use of machine learning techniques to support the semi-automatic, incremental bootstrapping of the domain-specific target information extraction system.

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence