Intelligent Document Information Extraction

The goal of iDocument is to answer queries semantically over unstructured document corpora (e.g., news, large document collections). Based on existing background knowledge and an unstructured document body, which is open in terms of both size and fluctuation, an intelligent process is developed that uses a query issued by a user to filter the relevant documents out of the corpus.

The contents of the relevant documents are transformed into a semantic network by means of information extraction and by linking them with the background knowledge. A semantic search then examines this network with regard to the original query and returns the resulting relations and concepts to the user. This ad hoc processing makes previously unstructured (or semantically enriched) sources of information accessible to semantic search, which is another step toward bringing the Semantic Web to life.
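The three stages described above (filtering relevant documents, building a semantic network linked to background knowledge, and searching that network) can be illustrated with a minimal Python sketch. This is not the iDocument implementation: the data, the function names (filter_relevant, build_network, semantic_search, answer), and the simple keyword matching that stands in for real information extraction are all assumptions made purely for illustration.

```python
import re

# A triple is (subject, predicate, object); hypothetical toy data throughout.
Triple = tuple[str, str, str]

# Assumed background knowledge, e.g. taken from an existing knowledge base.
BACKGROUND: set[Triple] = {
    ("Alice", "worksFor", "Acme"),
    ("Acme", "locatedIn", "Berlin"),
}

# Assumed unstructured document corpus.
CORPUS: dict[str, str] = {
    "doc1": "Alice presented the new Acme search prototype in Berlin.",
    "doc2": "The weather was sunny all week.",
    "doc3": "Bob joined Acme last year.",
}


def tokens(text: str) -> set[str]:
    """Lowercased word tokens of a text."""
    return set(re.findall(r"\w+", text.lower()))


def filter_relevant(query: str, corpus: dict[str, str]) -> dict[str, str]:
    """Stage 1: keep documents sharing at least one token with the query."""
    q = tokens(query)
    return {doc_id: text for doc_id, text in corpus.items() if q & tokens(text)}


def build_network(docs: dict[str, str], background: set[Triple]) -> set[Triple]:
    """Stage 2: link documents to known entities (toy 'information extraction').

    Only single-word entity labels are matched in this sketch.
    """
    entities = {s for s, _, _ in background} | {o for _, _, o in background}
    network = set(background)
    for doc_id, text in docs.items():
        words = tokens(text)
        for entity in entities:
            if entity.lower() in words:
                network.add((doc_id, "mentions", entity))
    return network


def semantic_search(query: str, network: set[Triple]) -> list[Triple]:
    """Stage 3: return triples whose subject or object matches a query token."""
    q = tokens(query)
    return sorted(t for t in network if q & {t[0].lower(), t[2].lower()})


def answer(query: str) -> list[Triple]:
    """Run the whole ad hoc pipeline for one query."""
    relevant = filter_relevant(query, CORPUS)
    network = build_network(relevant, BACKGROUND)
    return semantic_search(query, network)


if __name__ == "__main__":
    for triple in answer("Acme"):
        print(triple)
```

In this sketch the network is built per query from the filtered documents only, which mirrors the ad hoc processing described above; a real system would replace the token matching with proper entity recognition and relation extraction.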

Publications about the project

Benjamin Adrian, Jörn Hees, Ludger van Elst, Andreas Dengel

In: Bärbel Mertsching, Marcus Hund, Zaheer Aziz (eds.). KI 2009: Advances in Artificial Intelligence. German Conference on Artificial Intelligence (KI-2009), September 15-18, Paderborn, Germany. Pages 249-256, Lecture Notes in Artificial Intelligence (LNAI) 5803, ISBN 978-3-642-04616-2, Springer-Verlag, Heidelberg, 9/2009.

Stefan Dellmuth, Heiko Maus, Andreas Dengel

In: Proceedings of the Third International Workshop on Camera Based Document Analysis and Recognition. International Workshop on Camera-Based Document Analysis and Recognition (CBDAR-09), located at ICDAR 2009, July 25, Barcelona, Spain, 7/2009.

Benjamin Adrian, Heiko Maus, Malte Kiesel, Andreas Dengel

In: Knut Hinkelmann, Holger Wache (eds.). WM2009: 5th Conference on Professional Knowledge Management. Conference on Professional Knowledge Management (WM-2009), March 25-27, Solothurn, Switzerland. Lecture Notes in Informatics (LNI) P-145, ISBN 978-3-88579-239-0, Gesellschaft für Informatik, Bonn, 3/2009.

German Research Center for Artificial Intelligence
Deutsches Forschungszentrum für Künstliche Intelligenz