Border Noise Removal of Camera-Captured Document Images using Page-Frame Detection
Syed Saqib Bukhari; Faisal Shafait; Thomas Breuel
In: 4th International Workshop on Camera-Based Document Analysis and Recognition (CBDAR-11), September 22, Beijing, China, Lecture Notes in Computer Science (LNCS), Springer, 9/2011.
Camera-captured document images usually contain two main types of marginal noise: textual noise (coming from neighboring pages) and non-textual noise (resulting from the page surroundings and/or the binarization process). These types of marginal noise degrade the performance of preprocessing (dewarping) of camera-captured document images and of subsequent document digitization/recognition processes. Page frame detection is one of the newly investigated areas in document image processing; it is used to remove border noise and to identify the actual content area of document images. In this paper, we present a new technique for page frame detection of camera-captured document images. We use text and non-text content information to find the page frame of document images. We evaluate our algorithm on the DFKI-I (CBDAR 2007 Dewarping Contest) dataset. Experimental results show the effectiveness of our method in comparison to other state-of-the-art page frame detection approaches.