Automatic Word Ground Truth Generation for Camera Captured Documents

Sheraz Ahmed; Koichi Kise; Masakazu Iwamura; Marcus Liwicki; Andreas Dengel
In: PRMU. Pattern Recognition and Media Understanding (PRMU-2013), located at CEATEC 2013, October 3-4, Makuhari Messe, Chiba, Japan, IEICE, 2013.


A dataset of camera-captured documents is useful for training OCR systems to achieve better performance. However, no such dataset exists, because building one manually is laborious and costly. This paper proposes a fully automatic approach for building very large-scale (i.e., millions of images) labeled datasets of camera-captured documents. The approach requires no human intervention in labeling. An evaluation of samples generated by the approach shows that more than 97% of the images are correctly labeled. The novelty of the approach lies in its use of document image retrieval for automatic labeling, in particular for camera-captured documents, which contain camera-specific distortions such as blur and perspective distortion.
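
The abstract does not describe the labeling step in detail. As a rough illustration of how retrieval-based labeling could work, the sketch below assumes that retrieval has already matched a captured image to its electronic source document, and that a 3x3 homography between the two is available. Word bounding boxes from the source document are then projected into the captured image, giving each word region its text label. The function names, the homography, and the example word box are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)
Matrix3 = List[List[float]]


def project_point(h: Matrix3, p: Point) -> Point:
    """Apply a 3x3 homography to a 2D point using homogeneous coordinates."""
    x, y = p
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    return (u / w, v / w)


def transfer_word_labels(
    h: Matrix3, words: List[Tuple[str, Box]]
) -> List[Tuple[str, List[Point]]]:
    """Project labeled word boxes from the source document into the captured image.

    `words` holds (text, box) pairs in source-document coordinates.
    Each result is (text, quadrilateral) in captured-image coordinates.
    """
    labeled = []
    for text, (x0, y0, x1, y1) in words:
        # Corners in clockwise order, starting at the top-left.
        corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
        quad = [project_point(h, c) for c in corners]
        labeled.append((text, quad))
    return labeled


# Hypothetical example: the captured image is the source scaled by 2
# and shifted by (10, 5). A real capture would also involve perspective.
H = [[2.0, 0.0, 10.0],
     [0.0, 2.0, 5.0],
     [0.0, 0.0, 1.0]]
source_words = [("camera", (0.0, 0.0, 30.0, 10.0))]
labeled_words = transfer_word_labels(H, source_words)
```

Each projected quadrilateral could then be cropped from the captured image and stored together with its text as one ground-truth word sample.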