Publication

Border Noise Removal of Camera-Captured Document Images using Page-Frame Detection

Syed Saqib Bukhari, Faisal Shafait, Thomas Breuel

In: 4th International Workshop on Camera-Based Document Analysis and Recognition (CBDAR-2011), September 22, Beijing, China. Springer, 2011.

Abstract

Camera-captured document images usually contain two main types of marginal noise: textual noise (coming from neighboring pages) and non-textual noise (resulting from the page surroundings and/or the binarization process). These types of marginal noise degrade the performance of preprocessing (dewarping) of camera-captured document images and of subsequent document digitization/recognition processes. Page frame detection, a recently investigated area in document image processing, is used to remove border noise and to identify the actual content area of document images. In this paper, we present a new technique for page frame detection in camera-captured document images. We use information about text and non-text content to find the page frame of document images. We evaluate our algorithm on the DFKI-I (CBDAR 2007 Dewarping Contest) dataset. Experimental results show the effectiveness of our method in comparison to other state-of-the-art page frame detection approaches.

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence