Publication

Performance Comparison of Six Algorithms for Page Segmentation

Faisal Shafait; Daniel Keysers; Thomas Breuel
In: Horst Bunke; A. Lawrence Spitz (Eds.). 7th IAPR Workshop on Document Analysis Systems (DAS). IAPR International Workshop on Document Analysis Systems (DAS), Nelson, Pages 368-379, LNCS, Vol. 3872, Springer, 2/2006.

Abstract

This paper presents a quantitative comparison of six algorithms for page segmentation: X-Y cut, smearing, whitespace analysis, constrained text-line finding, Docstrum, and Voronoi-diagram-based segmentation. The evaluation is performed using a subset of the UW-III collection commonly used for evaluation, with a separate training set for parameter optimization. We compare the results using both default parameters and optimized parameters. In the course of the evaluation, the strengths and weaknesses of each algorithm are analyzed, and it is shown that no single algorithm outperforms all other algorithms. However, we observe that the three best-performing algorithms are those based on constrained text-line finding, Docstrum, and the Voronoi diagram.