Layout Error Correction using Deep Neural Networks

Srie Raam Mohan; Syed Saqib Bukhari; Andreas Dengel

In: DAS. IAPR International Workshop on Document Analysis Systems (DAS-2018), April 24-27, IEEE, 2018.


Layout analysis, which mainly comprises binarization and page/line segmentation, is one of the most important performance-determining steps of an OCR system for complex medieval historical document images, which contain noise, distortions, and irregular layouts. In this paper, we present a novel layout error correction technique that includes a VGG Net to classify non-textlines and an adversarial network approach to obtain the layout bounding mask. The presented layout error correction technique is applied to a collection of 15th-century Latin documents, on which it achieved more than 75% accuracy for segmentation techniques.

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence