OCR Based Thresholding

Yves Rangoni, Faisal Shafait, Thomas Breuel

In: Proceedings of the 11th IAPR Conference on Machine Vision Applications (MVA-2009), May 20-22, Yokohama, Japan, ISBN 978-4-901122-09-2, Springer-Verlag, 5/2009.


In large-scale digitization processes, several common tasks are performed to provide an electronic version of a paper document. One of the first steps is the thresholding of the image, which is necessary for the subsequent procedures to work correctly. Many binarization methods have been proposed to solve this problem, but they need to be tuned on the target document corpus to obtain the best results. In this paper, we introduce a fully automatic thresholding method for printed document analysis. The purpose is to automatically obtain the most suitable parameters of a binarizer for a given document image according to the quality of the output of an OCR system. Tuning can be done either on a full page or on sample text-lines extracted from a page image. As opposed to existing methods, the tuning is directly goal-directed and depends neither on subjective visual evaluation nor on non-representative performance criteria. We demonstrate the effectiveness of the approach on a subset of 740 pages from the Google 1000 Books dataset. Results show that by choosing the right binarizer parameters with the Recognition Driven Thresholding (RDT) method, the words-in-dictionary error rate of an OCR system can be reduced by 6%.
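The idea of selecting binarizer parameters by the quality of the OCR output can be illustrated with a minimal sketch. It assumes an exhaustive grid search, which the abstract does not confirm as the search strategy used in the paper, and it approximates the words-in-dictionary error rate as the fraction of recognized words missing from a word list. The names `words_in_dictionary_error`, `tune_binarizer`, `binarize`, `ocr`, and `param_grid` are hypothetical and chosen for illustration only; the binarizer and the OCR engine are passed in as plain callables.

```python
from itertools import product
from typing import Any, Callable, Dict, Sequence, Set, Tuple


def words_in_dictionary_error(text: str, dictionary: Set[str]) -> float:
    """Fraction of recognized words that are NOT found in the dictionary."""
    # Strip surrounding punctuation and lowercase each token.
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return 1.0  # assumption: empty OCR output counts as the worst case
    misses = sum(1 for w in words if w not in dictionary)
    return misses / len(words)


def tune_binarizer(
    image: Any,
    binarize: Callable[..., Any],          # hypothetical: binarize(image, **params)
    ocr: Callable[[Any], str],             # hypothetical: ocr(binary_image) -> text
    param_grid: Dict[str, Sequence[Any]],  # candidate values per parameter
    dictionary: Set[str],
) -> Tuple[Dict[str, Any], float]:
    """Return the parameter setting with the lowest words-in-dictionary error."""
    best_params: Dict[str, Any] = {}
    best_error = float("inf")
    names = sorted(param_grid)
    # Try every combination of candidate values (exhaustive grid search).
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        text = ocr(binarize(image, **params))
        error = words_in_dictionary_error(text, dictionary)
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error
```

In this sketch, `image` can be a full page or a few sample text-lines, matching the two tuning modes mentioned in the abstract; the grid could hold, for example, the window size and weighting factor of a local thresholding method.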


2009-IUPR-26Feb_1318.pdf (pdf, 830 KB)

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence