On Benchmarking of Invoice Analysis Systems

Bertin Klein, Stefan Agne, Andreas Dengel

In: Horst Bunke, A. Lawrence Spitz (editors). 7th International Workshop on Document Analysis Systems, DAS 2006. IAPR International Workshop on Document Analysis Systems (DAS), Nelson. Pages 312-323, Lecture Notes in Computer Science 3872, Springer-Verlag GmbH, 2/2006.


An approach is presented to guide the benchmarking of invoice analysis systems, a specific, applied subclass of document analysis systems. The state of the art in benchmarking document analysis systems is reviewed along the processing levels Document Page Segmentation, Text Recognition, Document Classification, and Information Extraction. The restriction to invoices both enables and requires a more purposeful, i.e. more detailed, targeting of the benchmarking procedures (acquisition of ground truth data, system runs, comparison of data, condensation into meaningful numbers). To this end, the processing of invoices is dissected, and the data structures involved are elicited and presented. These data structures are the building blocks of the actual benchmarking of invoice analysis systems.
