Comparison between U-Net and U-ReNet models in OCR tasks

Brian Moser, Federico Raue, Jörn Hees, Andreas Dengel

In: Proceedings of the 28th International Conference on Artificial Neural Networks (ICANN 2019), September 17-19, Munich, Germany. Springer, 9/2019.


The goal of this paper is to analyze RNNs instead of CNNs in the U-Net architecture and to show the benefit of RNNs over CNNs: they make the architecture more robust to translation. To show this, we build a U-Net architecture consisting mostly of ReNet layers, including an RNN-based upsampling step. This U-ReNet approach is not novel, but it has never been explored without deep convolutional layers. We evaluate the presented model on two generated datasets of text lines based on MNIST. The task in both datasets is to separate the digits in text lines of partially overlapping sequences. Our model reaches the best performance on one dataset and the second-best performance on the other. Additionally, our model has fewer parameters than U-Net.
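The core building block replacing convolutions here is the ReNet layer, which sweeps the feature map with bidirectional RNNs, first along rows, then along columns. The following is a minimal NumPy sketch of that idea (plain tanh RNNs, random weights, and all dimensions chosen for illustration; the paper's actual model and hyperparameters may differ):

```python
import numpy as np

rng = np.random.default_rng(0)

def rnn_sweep(seq, Wx, Wh, b):
    """Plain tanh RNN over a sequence (T, d_in); returns all hidden states (T, d_hid)."""
    h = np.zeros(Wh.shape[0])
    out = []
    for x_t in seq:
        h = np.tanh(x_t @ Wx + h @ Wh + b)
        out.append(h)
    return np.stack(out)

def make_params(d_in, d_hid):
    # Small random weights for an illustrative, untrained layer.
    return (rng.normal(0, 0.1, (d_in, d_hid)),
            rng.normal(0, 0.1, (d_hid, d_hid)),
            np.zeros(d_hid))

def bidir_rows(x, p_fwd, p_bwd):
    """Bidirectional RNN sweep along the width axis: (H, W, C) -> (H, W, 2*d_hid)."""
    rows = []
    for row in x:
        f = rnn_sweep(row, *p_fwd)            # left-to-right pass
        b = rnn_sweep(row[::-1], *p_bwd)[::-1]  # right-to-left pass, realigned
        rows.append(np.concatenate([f, b], axis=-1))
    return np.stack(rows)

def renet_layer(x, d_hid):
    """One ReNet layer: horizontal bidirectional sweep, then a vertical one."""
    C = x.shape[-1]
    ph = (make_params(C, d_hid), make_params(C, d_hid))
    y = bidir_rows(x, *ph)
    # The vertical sweep is a horizontal sweep of the transposed feature map.
    pv = (make_params(2 * d_hid, d_hid), make_params(2 * d_hid, d_hid))
    z = bidir_rows(y.transpose(1, 0, 2), *pv).transpose(1, 0, 2)
    return z

feat = rng.normal(size=(8, 8, 3))   # toy 8x8 feature map with 3 channels
out = renet_layer(feat, d_hid=4)
print(out.shape)                     # (8, 8, 8): 2*d_hid channels from the final sweep
```

Because each output position has seen an entire row and an entire column of the input, a ReNet layer aggregates context globally in one pass, which is the property the paper exploits in place of stacked convolutions.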

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence