





Because many automated digitization setups lack a first-line inspection step, identifying forged documents has become more difficult. The widespread availability of high-quality printing and scanning devices has aggravated the problem further by enabling even non-experts to produce high-quality forgeries. Training a machine learning system for forgery detection poses several challenges, such as unbalanced classes or even the complete absence of one class (no real forgeries might be available for training).

The AnDruDok project aims to bring together research in document forensics and anomaly detection in order to identify suspicious documents in a document collection. Its main objective is to investigate unsupervised machine learning techniques for forgery detection in document images. In particular, approaches based on modeling class distributions will be investigated to develop algorithms that detect forged documents as outliers in the collection.
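As a rough illustration of detecting forgeries as outliers of a modeled distribution, the following minimal sketch fits a diagonal Gaussian to an unlabeled collection of feature vectors and flags items that lie far from it. It is not the project's method: the two numeric features per document, the function names, the synthetic data, and the threshold are all assumptions made for this example.

```python
import math
import random
from statistics import mean, pstdev


def fit_diagonal_gaussian(features):
    """Estimate a per-feature mean and standard deviation from the whole
    (unlabeled) collection. Assumes independent features."""
    columns = list(zip(*features))
    means = [mean(col) for col in columns]
    # Guard against zero spread so the score below never divides by zero.
    stds = [pstdev(col) or 1e-9 for col in columns]
    return means, stds


def outlier_score(x, means, stds):
    """Normalized Euclidean distance of one feature vector from the
    centre of the fitted distribution (larger means more unusual)."""
    return math.sqrt(sum(((v - m) / s) ** 2 for v, m, s in zip(x, means, stds)))


def flag_outliers(features, threshold=3.0):
    """Return the indices of items whose score exceeds the threshold."""
    means, stds = fit_diagonal_gaussian(features)
    return [i for i, x in enumerate(features)
            if outlier_score(x, means, stds) > threshold]


# Synthetic stand-in for document features (hypothetical, e.g. two
# print-quality measurements per scanned page).
rng = random.Random(0)
collection = [[rng.gauss(0.0, 1.0), rng.gauss(5.0, 0.5)] for _ in range(200)]
collection.append([8.0, 9.0])  # one atypical item standing in for a forgery

suspicious = flag_outliers(collection, threshold=4.0)
print(suspicious)
```

No labels are used at any point, which matches the setting in which no real forgeries are available for training. The threshold trades missed forgeries against false alarms and would need tuning on real data.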


Stiftung Rheinland-Pfalz für Innovation

Publications about the project

Markus Goldstein

PhD-Thesis, Technische Universität Kaiserslautern, ISBN 978-3-8439-1572-4, Dr. Hut, München, 2/2014.


Mennatallah Amer; Markus Goldstein; Slim Abdennadher

In: Proceedings of the ACM SIGKDD Workshop on Outlier Detection and Description (ODD). International Conference on Knowledge Discovery and Data Mining (KDD-2013), August 11-14, Chicago, IL, USA, Pages 8-15, ISBN 978-1-4503-2335-2, ACM, New York, NY, USA, 8/2013.


Johann Gebhardt; Markus Goldstein; Faisal Shafait; Andreas Dengel

In: Proceedings of the 12th International Conference on Document Analysis and Recognition. International Conference on Document Analysis and Recognition (ICDAR-2013), 12th, August 25-28, Washington, DC, USA, Pages 479-483, ISBN 978-0-7695-4999-3, IEEE Computer Society, 8/2013.
