Document Inspection Using Text-Line Alignment

Joost van Beusekom, Faisal Shafait, Thomas Breuel

In: 9th IAPR International Workshop on Document Analysis Systems (DAS-2010), June 9-11, Boston, MA, United States. ACM, 6/2010.


Passports, ID cards, banknotes, and degree certificates are regarded as valuable documents that need to be secured against forgery. Many other document types, such as bills and vouchers, are valuable too but carry no security features. Fraudsters may use them to obtain money illegitimately, for example from a car insurance company. The wide availability of scanning and printing hardware allows even non-experts to forge a document easily. We therefore present a new aspect of examining intrinsic document features for optical document security: the goal is to automatically detect text-lines that have been manipulated or inserted into a document by inspecting their alignment (left, right, or center) with respect to the other text-lines in the document. This adds a further feature toward the goal of developing a powerful toolbox for automatic document inspection. The alignment margins are determined from the extracted text-lines. Statistics on the distances of the text-lines to the alignment margins are then used to identify lines that might have been forged. Such documents can then be presented to a human operator for further inspection. Because no public datasets containing forged documents were available, a new dataset had to be created. Evaluation showed a classification accuracy of 90.5%.
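
The idea of comparing each text-line against the alignment margins can be illustrated with a minimal Python sketch. This is not the authors' implementation, and the abstract does not specify the statistics used. The sketch assumes that text-lines are already available as hypothetical (x_left, x_right) pixel pairs, estimates each margin as a median, picks the alignment with the smallest spread, and flags lines by a median-absolute-deviation threshold. The function name and the parameters k and min_tol are illustrative assumptions.

```python
from statistics import median


def _deviations(values):
    """Return each value's distance to the median, plus the median of those distances (MAD)."""
    margin = median(values)  # assumed margin estimate, not taken from the paper
    dists = [abs(v - margin) for v in values]
    return dists, median(dists)


def flag_misaligned_lines(lines, k=3.0, min_tol=2.0):
    """Flag text-lines that deviate from the document's dominant alignment margin.

    lines:   list of (x_left, x_right) pixel coordinates, one pair per text-line.
    k:       hypothetical multiplier on the MAD used as the outlier threshold.
    min_tol: hypothetical minimum tolerance in pixels, so a MAD of 0 does not
             flag every line with sub-pixel jitter.
    Returns (alignment_name, sorted list of flagged line indices).
    """
    candidates = {
        "left": [left for left, right in lines],
        "right": [right for left, right in lines],
        "center": [(left + right) / 2 for left, right in lines],
    }
    best_name, best_dists, best_mad = None, None, None
    for name, values in candidates.items():
        dists, mad = _deviations(values)
        # The alignment with the smallest spread is treated as the dominant one.
        if best_mad is None or mad < best_mad:
            best_name, best_dists, best_mad = name, dists, mad
    tol = max(k * best_mad, min_tol)
    flagged = [i for i, d in enumerate(best_dists) if d > tol]
    return best_name, flagged


# Made-up example: a left-aligned block in which the fifth line is shifted by 8 pixels.
example = [(100, 500), (100, 480), (101, 510), (100, 350), (108, 495), (99, 505)]
print(flag_misaligned_lines(example))  # ('left', [4])
```

Note that a simple rule like this would also flag legitimately indented lines, such as the first line of a paragraph, which is one reason flagged documents are passed to a human operator rather than rejected automatically.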


Beusekom-Document-Forgery-Detection-DAS10.pdf (pdf, 1 MB)

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence