Efficient and Scalable k-Means on GPUs

Clemens Lutz, Sebastian Breß, Tilmann Rabl, Steffen Zeuch, Volker Markl

In: Theo Härder, Ralf Schenkel (editor). Datenbank-Spektrum Schwerpunktbeitrag 13222 Pages 1-13 Springer 2018.


k-Means is a versatile clustering algorithm widely used in practice. To cluster large data sets, state-of-the-art implementations use GPUs to shorten the data to knowledge time. These implementations commonly assign points on a GPU and update centroids on a CPU. We identify two main shortcomings of this approach. First, it requires expensive data exchange between proces- sors when switching between the two processing steps point assignment and centroid update. Second, even when processing both steps of k-means on the same processor, points still need to be read two times within an iteration, leading to inefficient use of memory bandwidth. In this paper, we present a novel approach for centroid update that allows us to efficiently process both phases of k-means on GPUs. We fuse point assignment and centroid update to execute one iteration with a single pass over the points. Our evaluation shows that our k-means approach scales to very large data sets. Overall, we achieve up to 20× higher throughput compared to the state-of-the-art approach.


datenbanken_spektrum_2018_camera_ready.pdf (pdf, 282 KB)

German Research Center for Artificial Intelligence
Deutsches Forschungszentrum für Künstliche Intelligenz