We will present the state of the art in intelligent information
extraction (IE). The lecture is divided into four major topics:
introduction, core technologies, machine learning (ML) methods, and
applications.
We start with a historical overview and explain the different tasks and
evaluation methods of IE (e.g.,
template filling, domain ontologies). We summarize the core IE
functionality by contrasting rule-based and corpus-based system design.
This will also cover advanced NLP aspects like integration of shallow
and deep processing. Secondly, participants will be confronted with
major IE challenges with respect to domain adaptivity (e.g.,
portability) and multi-linguality.
We then focus on advanced ML methods for the different IE
tasks along various dimensions (supervised, unsupervised,
multi-lingual). Finally, we present different exciting applications
that embed IE as a major component, viz. open-domain question
answering, text summarization,
text data mining, and Semantic Web services.
H. Cunningham, D. Maynard, K.
Bontcheva, V. Tablan. GATE: A
Framework and Graphical Development Environment for Robust NLP Tools
and Applications. Proceedings of the 40th Anniversary Meeting of
the Association for Computational Linguistics
(ACL'02). Philadelphia, July 2002.
Technologies for German Texts
Berthold Crysmann, Anette Frank, Bernd Kiefer, Stefan
Günter Neumann, Jakub Piskorski, Ulrich Schäfer, Melanie
Hans Uszkoreit, Feiyu Xu, Markus Becker and
for shallow and deep processing. In Proceedings of ACL-2002,
for Computational Linguistics 40th Anniversary Meeting, University of
Philadelphia, July 2002.