Publikation

Designing a Structured Lexicon for Document Image Analysis

Rainer Hoch, Michael Malburg

DFKI DFKI Research Reports (RR) 92-32 1992.

Abstrakt

This paper presents a structured, multi-level architecture of a lexicon which is a central component of our knowledge-based document analysis system. Our system has the task to transform incoming business letters into an equivalent electronic representation automatically. Moreover, partial text analysis and understanding of a letter's body and relevant parts are initiated to enrich the conceptual knowledge about the actual document (e.g., by a classification). In such an application domain, a well-designed lexicon has to consider requirements of both, text recognition and text analysis. For that purpose, we propose an appropriate lexicon architecture and the internal structure of corresponding lexical entries being a prerequisite for successful higher-level interpretations of documents.

RR-92-32.pdf (pdf, 17 MB )

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence