Table Localization and Segmentation using GAN and CNN

Mohammad Mohsin Reza, Syed Saqib Bukhari, Martin Jenckel, Andreas Dengel

In: 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW). International Conference on Document Analysis and Recognition Workshops (ICDARW-2019), September 22-25, Sydney, Australia. Pages 152-157, 5, ISBN 978-1-7281-5055-0, Association for Computing Machinery, 9/2019.


Table localization and segmentation is an important but challenging step in document image analysis. Table segmentation is much harder than table localization, particularly in invoice documents, because an invoice may contain nested rows, nested columns, or even nested tables. Moreover, rows and columns are often very close to each other, and columns sometimes overlap. Most existing techniques fail to generalize because they rely on hand-engineered features that are not robust to layout variations. Recently, deep learning approaches have been applied to table localization and segmentation and have achieved promising results. However, these techniques have mostly been applied to contemporary document images such as those in the UNLV or Marmot datasets. Additionally, their ability to generalize across different layout variations and preprocessing is still limited. In this paper, we apply a conditional Generative Adversarial Network (cGAN) based architecture for table area localization and a SegNet-based encoder-decoder architecture with skip connections for table structure segmentation. We use the ICDAR 2013 table competition dataset to evaluate table localization performance. The results show that our approach outperforms existing models. For table segmentation, we use a private dataset of complex invoices.
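To make the localization evaluation more concrete, the following is a minimal sketch of how a predicted table mask could be scored against a ground-truth table region. It is an illustration under stated assumptions, not the evaluation protocol of the paper: the abstract does not name the metric, so intersection over union (IoU) is assumed here, and the names `mask_to_box`, `box_iou`, `predicted_mask`, and `ground_truth` are hypothetical.

```python
from typing import List, Optional, Tuple

# (x_min, y_min, x_max, y_max) in inclusive pixel coordinates.
Box = Tuple[int, int, int, int]


def mask_to_box(mask: List[List[int]]) -> Optional[Box]:
    """Return the tight bounding box around all non-zero pixels, or None if the mask is empty."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for row in mask for x, value in enumerate(row) if value]
    if not rows:
        return None
    return (min(cols), min(rows), max(cols), max(rows))


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two inclusive pixel boxes."""
    # Overlap extent along each axis; negative values mean no overlap.
    overlap_w = min(a[2], b[2]) - max(a[0], b[0]) + 1
    overlap_h = min(a[3], b[3]) - max(a[1], b[1]) + 1
    intersection = max(overlap_w, 0) * max(overlap_h, 0)
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return intersection / (area_a + area_b - intersection)


# Toy 6x8 "page": 1 marks pixels that a localization model (assumed) labels as table.
predicted_mask = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
# Hypothetical annotated table region for the same page.
ground_truth: Box = (1, 1, 6, 3)

predicted_box = mask_to_box(predicted_mask)
score = box_iou(predicted_box, ground_truth)
print(predicted_box, round(score, 3))
```

In this toy case the predicted box misses one column of the annotated table, so the score falls below 1. A real evaluation would also need a rule for pages with several tables, which this sketch does not cover.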

German Research Center for Artificial Intelligence