Publication

Reflowable Document Images for the Web

Thomas Breuel

In: Web Document Analysis 2003. International Workshop on Web Document Analysis (WDA-2003), located at ICDAR 2003, August 3, Edinburgh, United Kingdom, 2003.

Abstract

The paper describes ongoing work on a system that transforms page-oriented document images into "reflowable document images": representations of the page image in HTML format that allow it to adapt to display devices of different sizes while preserving the original appearance of the image as much as possible and avoiding OCR errors. The paper outlines the system's approach to document layout analysis and discusses the strengths and limitations of HTML for this application.
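
To make the idea of a reflowable document image concrete, the following is a minimal Python sketch, not the system described in the paper. It assumes that layout analysis has already cropped each word into its own image file, and it emits each word as an inline HTML image so that the browser can re-break lines for any display width without running OCR. The function name `reflowable_html`, the `(path, width, height)` tuple layout, and the `scale` parameter are hypothetical choices made for illustration.

```python
from html import escape


def reflowable_html(paragraphs, scale=1.0):
    """Build an HTML fragment in which every word is an inline image.

    `paragraphs` is a list of paragraphs; each paragraph is a list of
    (image_path, width_px, height_px) tuples, one per cropped word image.
    This input format is an assumption, not taken from the paper.
    """
    parts = []
    for words in paragraphs:
        imgs = []
        for path, width, height in words:
            # Apply one uniform scale factor so relative word sizes are kept.
            w = round(width * scale)
            h = round(height * scale)
            imgs.append(
                f'<img src="{escape(path, quote=True)}" '
                f'width="{w}" height="{h}" alt="">'
            )
        # Whitespace between inline images gives the browser break points,
        # so lines reflow when the viewport width changes.
        parts.append("<p>" + " ".join(imgs) + "</p>")
    return "\n".join(parts)
```

Calling `reflowable_html([[("w1.png", 40, 20), ("w2.png", 60, 30)]], scale=0.5)` yields a single `<p>` element containing two `<img>` tags separated by a space. Because the words stay as pixels, the original appearance is preserved and no recognition errors are introduced; only the line breaks change.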

ReflowableDocImages.pdf (pdf, 363 KB)

German Research Center for Artificial Intelligence
Deutsches Forschungszentrum für Künstliche Intelligenz