
Publication

Modeling of Image Variability for Recognition

Daniel Keysers
PhD-Thesis, RWTH Aachen University, Aachen, Germany, 2006.

Abstract

This thesis presents the application of different models of image variability to visual recognition problems using the paradigm of appearance-based recognition. We first discuss linear models of variability and relate them to the use of Gaussian distributions. This allows us to use well-understood estimation methods to determine the vectors representing the variability. We also relate the discriminative maximum entropy approach to the Gaussian case and use the relationship to derive the novel maximum entropy linear discriminant analysis. Secondly, we investigate discrete deformation models (which map pixels onto pixels) of order zero, one, and two, where the order is determined by the constraints imposed on the two-dimensional image distortion. We prove for the first time that the determination of the best match for the second-order model belongs to the class of NP-hard problems. We show that it is important to include a suitable context for each pixel to achieve low error rates, which is then possible using the less complex models of lower order. We furthermore discuss the use of local patches for visual object categorization as a model allowing high image variability and show how the use of discriminative training leads to very competitive results. Finally, we describe a model for holistic scene analysis that allows us to determine a visual representation of objects present in a set of images. The methods are primarily applied to the tasks of handwritten character recognition and medical image categorization, yielding excellent results in both cases. In particular, we achieve an error rate of 0.52% on the well-known MNIST benchmark and 12.6% on the IRMA-10,000 database, the lowest within the 2005 ImageCLEF evaluation. We show that the models of image variability also improve the recognition performance of appearance-based sign language and gesture recognition systems. This emphasizes the models' broad applicability.
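To make the idea of a zero-order deformation model with pixel context more concrete, the following is a minimal sketch, not the implementation used in the thesis. It assumes that "order zero" means every pixel chooses its displacement independently within a small warp range, and that "context" means comparing a small square neighbourhood around each pixel rather than the pixel alone. The function name `zero_order_deformation_distance`, the parameters `warp` and `context`, the squared-difference cost, and the edge clamping at image borders are all illustrative assumptions.

```python
def zero_order_deformation_distance(a, b, warp=1, context=1):
    """Sum over pixels of `a` of the best local match found in `b`.

    Hypothetical sketch: each pixel of `a` may be displaced independently by
    up to `warp` pixels in each direction (no constraints between neighbours),
    and candidates are compared on (2*context+1) x (2*context+1) patches.
    Images are lists of equally long rows of numbers.
    """
    h, w = len(a), len(a[0])
    if len(b) != h or any(len(row) != w for row in a + b):
        raise ValueError("images must have the same rectangular shape")

    def pixel(img, r, c):
        # Assumption: coordinates outside the image are clamped to the border.
        r = min(max(r, 0), h - 1)
        c = min(max(c, 0), w - 1)
        return img[r][c]

    offsets = range(-context, context + 1)  # patch offsets around a pixel
    shifts = range(-warp, warp + 1)         # allowed displacements
    total = 0.0
    for i in range(h):
        for j in range(w):
            best = float("inf")
            for di in shifts:
                for dj in shifts:
                    # Squared difference between the patch around (i, j) in `a`
                    # and the patch around the displaced position in `b`.
                    cost = 0.0
                    for u in offsets:
                        for v in offsets:
                            diff = pixel(a, i + u, j + v) - pixel(b, i + di + u, j + dj + v)
                            cost += diff * diff
                    best = min(best, cost)
            total += best
    return total
```

Because every pixel is matched independently, the cost of this sketch grows only linearly with the number of pixels for fixed `warp` and `context`, in contrast to the second-order model, for which the abstract states that finding the best match is NP-hard. Such a distance could, for example, be plugged into a nearest-neighbour classifier; note that it is not symmetric in `a` and `b`.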