Trainable Multiscript Orientation Detection

Joost van Beusekom, Yves Rangoni, Thomas Breuel

In: Proceedings of SPIE Document Recognition and Retrieval XVII. SPIE Conference on Document Recognition and Retrieval (DRR-10), January 19-21, San José, CA, United States. Pages 1-8, 7534, SPIE, 1/2010.


Detecting the correct orientation of document images is an important step in large-scale digitization processes, as most subsequent document analysis and optical character recognition methods assume that the document page is upright. Many methods have been proposed to solve the problem, most of which are based on computing the ratio of ascenders to descenders. Unfortunately, this cannot be used for scripts that have neither ascenders nor descenders. Therefore, we present a trainable method that uses character similarity to compute the correct orientation. A connected-component-based distance measure is computed to compare the characters of the document image to characters whose orientation is known. The orientation for which the distance is lowest is then taken as the correct orientation. Training is easily achieved by replacing the reference characters with characters of the script to be analyzed. Evaluation of the proposed approach showed an accuracy above 99% for Latin and Japanese script from the public UW-III and UW-II datasets. An accuracy of 98.9% was obtained for Fraktur on a non-public dataset. A comparison of the proposed method with two methods that use ascender/descender-ratio-based orientation detection shows a significant improvement.
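
The core idea of the abstract, choosing the rotation whose characters lie closest to upright reference characters, can be illustrated with a minimal sketch. It assumes that glyphs have already been segmented (for example as connected components) and normalised to square binary bitmaps of equal size. The pixel-mismatch distance and the names rotate90, pixel_distance and detect_orientation are illustrative assumptions, not the distance measure or the implementation used in the paper.

```python
from typing import Dict, List

# Assumption: a glyph is a square binary bitmap, 1 = ink, 0 = background.
Glyph = List[List[int]]


def rotate90(glyph: Glyph) -> Glyph:
    """Rotate a square bitmap by 90 degrees clockwise."""
    return [list(row) for row in zip(*glyph[::-1])]


def pixel_distance(a: Glyph, b: Glyph) -> int:
    """Count mismatching pixels (a stand-in for the paper's distance measure)."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def detect_orientation(page_glyphs: List[Glyph],
                       reference_glyphs: List[Glyph]) -> int:
    """Return the clockwise rotation (0, 90, 180 or 270 degrees) that brings
    the page glyphs closest to the upright reference glyphs."""
    totals: Dict[int, int] = {}
    current = page_glyphs
    for angle in (0, 90, 180, 270):
        # For each glyph, keep the distance to its nearest reference glyph,
        # then sum over all glyphs on the page.
        totals[angle] = sum(
            min(pixel_distance(g, r) for r in reference_glyphs)
            for g in current
        )
        current = [rotate90(g) for g in current]
    # The rotation with the lowest total distance is taken as the correction.
    return min(totals, key=totals.get)


# Hypothetical upright reference glyphs: an "L" and a "T" on a 3x3 grid.
REF_L: Glyph = [[1, 0, 0], [1, 0, 0], [1, 1, 1]]
REF_T: Glyph = [[1, 1, 1], [0, 1, 0], [0, 1, 0]]
```

In this sketch, "training" for another script amounts to passing a different list of reference glyphs, which mirrors the abstract's statement that only the reference characters need to be exchanged.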


Further Links

publ-orientation-detection.pdf (pdf, 407 KB)

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence