

Are Deep Models Robust against Real Distortions? A Case Study on Document Image Classification

Saifullah Saifullah; Shoaib Ahmed Siddiqui; Stefan Agne; Andreas Dengel; Sheraz Ahmed
In: 26th International Conference on Pattern Recognition 2022. International Conference on Pattern Recognition (ICPR-2022), August 21-25, Montreal, Quebec, Canada, IEEE, 2022.


As deep learning models for document image classification reach diminishing returns with near-perfect recognition scores, their robustness characteristics remain poorly understood. To evaluate the robustness of existing state-of-the-art document image classifiers against distortions commonly encountered in the real world, we present two benchmark datasets, RVL-CDIP-D and Tobacco3482-D. The benchmarks are generated by augmenting the well-known document image classification datasets RVL-CDIP and Tobacco3482 with 21 different types of distortions at varying severity levels. We use them to analyze the robustness characteristics of existing document image classification systems. Our analysis reveals that although higher-accuracy models exhibit relatively higher robustness, they still severely underperform on some specific distortions, with classification accuracies dropping from ∼90% to as low as ∼40% in some cases. Interestingly, some of these high-accuracy models perform even worse than the baseline AlexNet model in the presence of distortions, with the relative decline in their accuracy sometimes reaching as high as 300-450%. We envision these benchmarks serving as a strong signal of progress in document image classification beyond the saturated accuracy metrics. The datasets and the code to reproduce them are publicly available: robustness.
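
The evaluation protocol described in the abstract (distort each test image at several severity levels, then measure accuracy again) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the choice of Gaussian noise as the distortion, the five-level `NOISE_STD` mapping, and the helper names `gaussian_noise`, `accuracy`, and `accuracy_by_severity` are hypothetical and are not taken from the paper or its released code.

```python
import random

# Hypothetical severity-to-noise mapping (illustrative values only;
# the parameters used by the actual benchmarks are not stated in the abstract).
NOISE_STD = {1: 0.04, 2: 0.08, 3: 0.12, 4: 0.18, 5: 0.26}


def gaussian_noise(image, severity, seed=0):
    """Add zero-mean Gaussian noise to a grayscale image.

    `image` is a list of rows with pixel values in [0, 1];
    results are clipped back into [0, 1].
    """
    rng = random.Random(seed)
    std = NOISE_STD[severity]
    return [
        [min(1.0, max(0.0, px + rng.gauss(0.0, std))) for px in row]
        for row in image
    ]


def accuracy(predict, images, labels):
    """Fraction of images for which `predict` returns the true label."""
    correct = sum(1 for img, y in zip(images, labels) if predict(img) == y)
    return correct / len(labels)


def accuracy_by_severity(predict, images, labels, distort,
                         severities=(1, 2, 3, 4, 5)):
    """Return {severity: accuracy}; severity 0 is the undistorted test set."""
    results = {0: accuracy(predict, images, labels)}
    for s in severities:
        distorted = [distort(img, s) for img in images]
        results[s] = accuracy(predict, distorted, labels)
    return results
```

Here `predict` stands in for any trained classifier wrapped as a function from image to label. Comparing the entry for severity 0 with the entries for higher severities gives the kind of accuracy drop the abstract reports, repeated once per distortion type.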
