DFKI-LT - Cross-Modal Learning Of Visual Categories Using Different Levels of Supervision

M. Fritz, Geert-Jan Kruijff, B. Schiele
Proceedings of the International Conference on Computer Vision Systems (ICVS 2007), Bielefeld, Germany, ICVS, 3/2007
 
Today’s object categorization methods use either supervised or unsupervised training methods. While supervised methods tend to produce more accurate results, unsupervised methods are highly attractive due to their potential to use far more training data, since that data can be unlabeled. This paper proposes a novel method that uses unsupervised training to obtain visual groupings of objects and a cross-modal learning scheme to overcome inherent limitations of purely unsupervised training. The method uses a unified and scale-invariant object representation that allows labeled as well as unlabeled information to be handled in a coherent way. One of the potential settings is to learn object category models from many unlabeled observations and a few dialogue interactions that can be ambiguous or even erroneous. First experiments demonstrate the ability of the system to learn meaningful generalizations across objects from only a few dialogue interactions.
 