Using iDocument for Document Categorization in Nepomuk Social Semantic Desktop

Benjamin Adrian, Martin Klinkigt, Heiko Maus, Andreas Dengel

In: Adrian Paschke, Hans Weigand, Wernher Behrendt, Klaus Tochtermann, Tassilo Pellegrini (eds.). Proceedings of I-KNOW 09 and I-SEMANTICS 09. International Conference on Semantic Systems (I-Semantics-09), September 2-4, Graz, Austria. Pages 638-643, J.UCS Conference Proceedings Series, ISBN 978-3-85125-060-2, Verlag der Technischen Universität Graz, Graz, 9/2009.


On the Semantic Desktop, users maintain their model of the world in a formal personal information model ontology. Concepts from this ontology are used to annotate documents on the desktop, allowing efficient navigation and browsing. However, the mental overhead required to classify new incoming documents correctly is substantial. We present the integration of the ontology-based information extraction system iDocument into the Nepomuk Semantic Desktop for classifying documents within the personal information model. We compare iDocument with the original classification system, Structure Recommender, using real models and documents from five Nepomuk users. The results provide evidence that iDocument's categorization proposals achieve higher recall and precision, and that iDocument's result ranking corresponds to user ratings.


Further Links

using_idocument_for_document_categorization.pdf (PDF, 111 KB)

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence