Coupled Snakelet Model for Curled Textline Segmentation of Camera-Captured Document Images

Syed Saqib Bukhari; Faisal Shafait; Thomas Breuel
In: Proceedings of the 10th International Conference on Document Analysis and Recognition. International Conference on Document Analysis and Recognition (ICDAR-09), July 26-29, Barcelona, Spain, IEEE, 7/2009.


Detection of curled textlines is an important step in dewarping hand-held camera-captured document images. Once textlines are detected, their baselines and the lines following the top of the x-height of characters (x-lines) are estimated for dewarping. Existing curled textline segmentation approaches are sensitive to outlier points and perspective distortions. Furthermore, these approaches use regression over the top and bottom points of a segmented textline to estimate its x-line and baseline separately, which may result in inaccurate estimation. Here we propose a novel curled textline segmentation approach based on active contours (snakes), in which we perform segmentation by estimating pairs of x-lines and baselines, solving both problems together. Starting from a connected component, we jointly trace a pair of x-line and baseline using coupled snakes and the external energies of neighboring top and bottom points. We grow the neighborhood region iteratively during tracing, which results in robustness to perspective distortions, and maintain the natural property of similar distance within each x-line and baseline pair, which results in robustness to outlier points. We achieved a one-to-one match-score recognition accuracy of 90.76% for curled textline segmentation on the CBDAR 2007 Document Image Dewarping Contest dataset, with good estimation of x-line and baseline pairs.
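
The coupling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `coupled_snakelet_step`, the parameters `radius`, `step`, and `coupling`, the vertical-only update, and the use of a local mean as the external force are all assumptions made for illustration. Each snake point is pulled toward nearby top or bottom points, and the two pulls are blended so that the x-line and baseline tend to move together and keep a similar distance.

```python
def coupled_snakelet_step(top, bottom, top_pts, bottom_pts,
                          radius, step=0.5, coupling=0.5):
    """One illustrative update of a paired x-line (top) and baseline (bottom).

    top, bottom         : lists of (x, y) snake points, same length and x order
    top_pts, bottom_pts : lists of (x, y) top/bottom points of components
    radius              : horizontal neighborhood size (hypothetical parameter)
    coupling            : 0 = independent snakes, 1 = fully shared movement
    """
    def pull(snake, pts):
        # External force (assumed form): vertical offset from each snake point
        # to the mean y of feature points within `radius` in x, or 0 if none.
        forces = []
        for x, y in snake:
            near = [py for px, py in pts if abs(px - x) <= radius]
            forces.append(sum(near) / len(near) - y if near else 0.0)
        return forces

    f_top = pull(top, top_pts)
    f_bot = pull(bottom, bottom_pts)

    new_top, new_bottom = [], []
    for (tx, ty), (bx, by), ft, fb in zip(top, bottom, f_top, f_bot):
        shared = 0.5 * (ft + fb)  # movement common to both snakes
        ft_c = (1.0 - coupling) * ft + coupling * shared
        fb_c = (1.0 - coupling) * fb + coupling * shared
        new_top.append((tx, ty + step * ft_c))
        new_bottom.append((bx, by + step * fb_c))
    return new_top, new_bottom


def trace_pair(top, bottom, top_pts, bottom_pts,
               radii=(10, 20, 40), iters=20):
    # Grow the neighborhood iteratively: repeat the update with larger radii.
    for r in radii:
        for _ in range(iters):
            top, bottom = coupled_snakelet_step(top, bottom,
                                                top_pts, bottom_pts, r)
    return top, bottom
```

With `coupling=1.0` both snakes receive the same displacement, so the distance within the pair is preserved exactly; an outlier that pulls only one of the two lines then has its effect halved rather than bending that line on its own.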



