Improved document image segmentation algorithm using multiresolution morphology

Syed Saqib Bukhari, Faisal Shafait, Thomas Breuel

In: SPIE Document Recognition and Retrieval XVIII. SPIE Conference on Document Recognition and Retrieval (DRR-11), January 23-27, San Francisco, CA, United States. SPIE, 1/2011.


Segmenting a page into text and non-text components is an essential preprocessing step before OCR. If it is not done properly, the OCR classification engine produces garbage text because of the non-text components. This paper describes improvements to the text/image segmentation algorithm described by Bloomberg [1], which is also available in his open-source Leptonica library [2]. The modifications yield significant improvements over Bloomberg's algorithm on the UW-III, UNLV, and ICDAR 2009 page segmentation competition test images and on circuit diagram datasets.


