Attention-Based Document Classifier Learning

Georg Buscher; Andreas Dengel

In: Proceedings of the Eighth IAPR International Workshop on Document Analysis Systems (DAS-08), September 16-19, Nara, Japan, pages 87-94, online proceedings, IEEE Computer Society, 9/2008.


We describe an approach for creating precise, personalized document classifiers based on the user's attention. The general idea is to observe which parts of a document the user was interested in just before making a classification decision. Knowing both this manual classification decision and the document parts it was based on, we can learn precise classifiers. To observe the user's focus of attention, we use an unobtrusive eye tracking device and apply an algorithm for reading behavior detection. On this basis, we extract terms that characterize the text parts of interest to the user and use them to describe the class the user assigned the document to. With classifiers learned in this way, new documents can be classified automatically using techniques of passage-based retrieval. A case study evaluating an attention-based term extraction method shows that incorporating the user's visual attention yields a strong improvement.
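
As an illustration only, the pipeline outlined in the abstract can be sketched in a few lines of Python. This is not the authors' implementation: it assumes the eye tracker and reading detection have already produced a list of passages the user read, and it substitutes plain term counts and cosine similarity for the term extraction and passage-based retrieval techniques of the paper. All function names and sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def learn_class_profile(read_passages):
    """Build a term-count profile from the passages the user read
    before assigning a document to a class (assumed input)."""
    counts = Counter()
    for passage in read_passages:
        counts.update(tokenize(passage))
    return counts


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(document_passages, profiles):
    """Score each class by its best-matching passage in the new
    document and return (label, score); (None, 0.0) if nothing matches."""
    best_label, best_score = None, 0.0
    for label, profile in profiles.items():
        score = max(
            (cosine(Counter(tokenize(p)), profile) for p in document_passages),
            default=0.0,
        )
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Hypothetical training data: passages the user read per class.
profiles = {
    "travel": learn_class_profile(
        ["flight booking to Nara", "hotel reservation and flight times"]
    ),
    "finance": learn_class_profile(
        ["quarterly budget report", "invoice and budget approval"]
    ),
}

# A new document, split into passages (the splitting itself is assumed).
new_doc = [
    "Dear colleagues, please find the agenda attached.",
    "The budget for the next quarter needs approval.",
]
label, score = classify(new_doc, profiles)
print(label)
```

Scoring a class by its single best passage, rather than by the whole document, mirrors the intuition that a short relevant passage should not be drowned out by unrelated surrounding text.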


Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence