Table Localization and Field Value Extraction in Piping and Instrumentation Diagram Images

Arka Sinha; Johannes Bayer; Syed Saqib Bukhari
In: Proceedings of ICDAR 2019. 15th International Conference on Document Analysis and Recognition (ICDAR-2019), September 20-21, Sydney, NSW, Australia, IAPR, 9/2019.


Piping and Instrumentation Diagrams (P&IDs) are graph-based engineering drawings utilised in process engineering. These documents also contain additional information in tabular form. In this paper, the localisation of these tables and the extraction of information from them are investigated. The documents used in this context are scanned raster versions of P&IDs with tabular data inside a frame. The objective is to extract field information from these tabular structures. The process is divided into two main steps: table localisation, followed by field extraction from the segmented tables. Table localisation is achieved primarily with contour detection methods from computer vision. For field-value extraction, a combination of rule-based keywords and a navigation approach is used, utilising Optical Character Recognition (OCR) for text extraction and regular expressions for string comparison. This paper describes the application of this extendable approach to the P&ID domain, where it achieved promising results on a private dataset.
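
The keyword-and-navigation idea for field-value extraction can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the actual keyword rules or navigation directions, so the `Word` type, the patterns in `FIELD_PATTERNS`, the `right_neighbour` rule, and the sample data are all hypothetical. The sketch assumes an OCR engine has already produced words with bounding boxes for a segmented table, matches label cells with regular expressions, and then navigates to the nearest word to the right on the same row to read the value.

```python
import re
from typing import NamedTuple, Optional


class Word(NamedTuple):
    """One OCR word with its bounding box (assumed OCR output format)."""
    text: str
    x: int
    y: int
    w: int
    h: int


# Hypothetical keyword rules; a real system would tune these to its tables.
FIELD_PATTERNS = {
    "drawing_no": re.compile(r"^(drawing|dwg)\.?\s*(no|number)\.?:?$", re.IGNORECASE),
    "revision": re.compile(r"^rev(ision)?\.?:?$", re.IGNORECASE),
    "date": re.compile(r"^date:?$", re.IGNORECASE),
}


def right_neighbour(label: Word, words: list[Word]) -> Optional[Word]:
    """Return the nearest word right of `label` that overlaps its row."""
    centre_y = label.y + label.h / 2
    candidates = [
        w for w in words
        if w is not label
        and w.x >= label.x + label.w           # starts right of the label
        and w.y <= centre_y <= w.y + w.h       # spans the label's vertical centre
    ]
    return min(candidates, key=lambda w: w.x, default=None)


def extract_fields(words: list[Word]) -> dict[str, str]:
    """Match label keywords, then navigate right to pick up each value."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        for word in words:
            if pattern.match(word.text):
                value = right_neighbour(word, words)
                if value is not None:
                    fields[name] = value.text
                break  # use the first matching label only
    return fields


# Made-up OCR result for a small title-block table.
SAMPLE = [
    Word("Dwg. No.", 10, 10, 60, 12), Word("PID-001", 80, 10, 70, 12),
    Word("Rev.", 10, 30, 30, 12), Word("B", 80, 30, 10, 12),
    Word("Date", 10, 50, 30, 12), Word("2019-09-20", 80, 50, 80, 12),
]

if __name__ == "__main__":
    print(extract_fields(SAMPLE))
```

Keeping the keyword patterns in a dictionary separate from the navigation rule is one way such an approach could remain extendable: supporting a new field would only require a new pattern, and a table layout with values below their labels would only require a different neighbour function.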