

Clustering Benchmark for Characters in Historical Documents

Martin Jenckel; Syed Saqib Bukhari; Andreas Dengel
In: DAS 2016 Short Paper Booklet. Pages 33-34, 4/2016.


In document analysis, clustering has found many applications, especially in the field of Optical Character Recognition (OCR). Depending on the task, however, clustering algorithms perform very differently. Especially when clustering single characters from historical documents, state-of-the-art algorithms still lack performance, and semi-manual methods are used instead \cite{retinas_2015}. We therefore examine several standard combinations of features and clustering algorithms and compare their performance on a character corpus extracted from the medieval novel "Narrenschiff".