Balanced Clustering for Content-based Image Browsing

Christopher Tim Althoff; Adrian Ulges; Andreas Dengel

In: GI-Informatiktage 2011. GI-Informatiktage (Informatik-2011), March 25-26, Bonn, Germany, Gesellschaft für Informatik e.V. 3/2011.


In recent years, the explosive growth of digitally stored image and video data has raised the need for tools that search and organize visual data automatically by content. Browsing environments, which structure image and video collections, are one solution to this problem. Such environments require image clustering techniques that group semantically related images, are highly scalable, and produce balanced structures. We propose a simple and efficient strategy to enforce a more balanced clustering, based on a hierarchical variant of the online k-means algorithm that favors small clusters over larger ones by adapting the prior probability of each cluster. We compare our method to standard hierarchical agglomerative techniques using multiple standard features and real-world datasets, showing that the proposed approach yields clusters of comparable quality while being substantially more balanced and scalable.
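The abstract does not give the exact update rule, so the following is only an illustrative sketch of the general idea, not the authors' method. It shows a flat (non-hierarchical) online k-means in which the assignment cost of each cluster is increased by a size-dependent term, so that small clusters are favored. The function name `balanced_online_kmeans`, the `penalty` parameter, and the logarithmic form of the size term are assumptions made for illustration.

```python
import numpy as np

def balanced_online_kmeans(points, k, penalty=1.0, seed=0):
    """Illustrative sketch: online k-means with a size penalty (assumed form).

    Each point is assigned to the cluster minimizing
        squared_distance + penalty * log(cluster_size + 1),
    so larger clusters become less attractive. This penalty is an
    assumption, not the rule from the paper.
    """
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    n = len(points)

    # Initialize centroids with k distinct, randomly chosen points.
    centroids = points[rng.choice(n, size=k, replace=False)].copy()
    counts = np.zeros(k, dtype=int)
    labels = np.empty(n, dtype=int)

    # Process the points one at a time, in random order (online setting).
    for i in rng.permutation(n):
        x = points[i]
        sq_dist = np.sum((centroids - x) ** 2, axis=1)
        cost = sq_dist + penalty * np.log(counts + 1)
        j = int(np.argmin(cost))

        # Standard online k-means update: running mean of assigned points.
        counts[j] += 1
        centroids[j] += (x - centroids[j]) / counts[j]
        labels[i] = j

    return labels, centroids, counts
```

With `penalty=0.0` this reduces to plain online k-means; increasing `penalty` trades cluster compactness for more even cluster sizes. A hierarchical variant could be obtained by applying the same routine recursively to each resulting cluster, which is again an assumption about how such a scheme might be structured.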



Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence