Combining Alignment Results for Historical Handwritten Document Analysis

E. Indermühle, Marcus Liwicki, Horst Bunke

In: Proceedings of the 10th International Conference on Document Analysis and Recognition 2009. International Conference on Document Analysis and Recognition (ICDAR-09), July 26-29, Barcelona, Spain, pages 1186-1190, 7/2009.


In this paper we propose a new strategy for combining the outputs of several alignment systems. New word boundaries are estimated from the word boundaries retrieved by a number of individual alignment systems. We investigate three strategies for this estimation: first, taking the mean of the individual boundaries; second, selecting their median; and third, taking the confidence values of the alignment systems into account. We apply the combination strategies to a word mapping system for historical handwritten manuscripts. After some preprocessing and normalization steps, three differently trained hidden Markov model based handwriting recognizers are applied to the text lines in forced alignment mode, which yields the positions of the word boundaries. A number of experiments show that the median-based combination strategy outperforms both the other strategies and all individual alignment systems, achieving a word mapping rate of about 95%.
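The three combination strategies can be illustrated with a minimal sketch. It assumes that every alignment system returns one boundary position per word (for example, an x-coordinate in pixels) and, for the third strategy, one confidence value per boundary. The abstract does not say how the confidence values are used, so the confidence-weighted mean below is only one plausible reading, not the authors' method. The function name `combine_boundaries` and all example numbers are hypothetical.

```python
from statistics import mean, median


def combine_boundaries(boundaries, strategy="median", confidences=None):
    """Combine word boundary positions from several alignment systems.

    boundaries:  one list of boundary positions per system, all equally long.
    confidences: one list of confidence values per system (same shape),
                 only needed for strategy="confidence".
    """
    n_words = len(boundaries[0])
    if any(len(b) != n_words for b in boundaries):
        raise ValueError("all systems must report the same number of boundaries")

    combined = []
    for i in range(n_words):
        # Positions proposed by each system for the i-th boundary.
        values = [b[i] for b in boundaries]
        if strategy == "mean":
            combined.append(mean(values))
        elif strategy == "median":
            combined.append(median(values))
        elif strategy == "confidence":
            # Assumption: confidence-weighted mean; the paper may differ.
            if confidences is None:
                raise ValueError("confidences are required for this strategy")
            weights = [c[i] for c in confidences]
            total = sum(weights)
            if total == 0:
                raise ValueError("confidence values must not sum to zero")
            combined.append(sum(w * v for w, v in zip(weights, values)) / total)
        else:
            raise ValueError(f"unknown strategy: {strategy}")
    return combined


# Hypothetical boundaries from three systems; the second one has an outlier.
systems = [
    [100, 220, 340],
    [104, 218, 400],
    [98, 225, 338],
]
print(combine_boundaries(systems, "median"))  # [100, 220, 340]
print(combine_boundaries(systems, "mean"))
```

In this made-up example the median ignores the outlying third boundary of the second system, whereas the mean is pulled towards it. That robustness to a single badly misplaced boundary is one possible intuition for why a median-based combination can do well.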


Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence