Distributed Clustering Based on Sampling Local Density Estimates

Matthias Klusch; Stefano Lodi; Gianluca Moro
In: Proceedings of the International Joint Conference on Artificial Intelligence 2003. International Joint Conference on Artificial Intelligence (IJCAI-03), August 9-15, Acapulco, Mexico, Pages 485-490, Morgan Kaufmann Publishers Inc. San Francisco, 2003.


Huge amounts of data are stored in autonomous, geographically distributed sources. The discovery of previously unknown, implicit and valuable knowledge is a key aspect of the exploitation of such sources. In recent years several approaches to knowledge discovery and data mining, and in particular to clustering, have been developed, but only a few of them are designed for distributed data sources. We propose a novel distributed clustering algorithm based on non-parametric kernel density estimation, which takes into account the issues of privacy and communication costs that arise in a distributed environment.
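
As an illustration only, and not the authors' algorithm, the following minimal sketch shows why kernel density estimation suits a distributed setting: the kernel sum over all data is the sum of the per-site kernel sums, so each site can transmit sampled values of its local estimate rather than its raw points. The Gaussian kernel, the bandwidth `h`, the one-dimensional sampling grid, the example data, and all function names are assumptions made for this sketch.

```python
import math

def gaussian_kernel(u):
    """Standard Gaussian kernel (an assumed choice of kernel)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def local_density_sum(points, x, h):
    """Unnormalized kernel sum of one site's points, evaluated at x."""
    return sum(gaussian_kernel((x - p) / h) for p in points)

def sample_local_estimate(points, grid, h):
    """Values a site would transmit: its kernel sum sampled on a shared grid."""
    return [local_density_sum(points, x, h) for x in grid]

# Hypothetical 1-D data held by three autonomous sites.
sites = [[1.0, 1.2, 0.8], [5.0, 5.1], [1.1, 4.9]]
h = 0.5                              # assumed bandwidth
grid = [i * 0.5 for i in range(13)]  # sample positions 0.0, 0.5, ..., 6.0

# Each site shares only sampled values, never its raw points.
samples = [sample_local_estimate(points, grid, h) for points in sites]

# A combining party adds the samples position by position and normalizes.
# Here the point counts are read from `sites` for brevity; in practice each
# site would transmit its count alongside its samples.
n = sum(len(points) for points in sites)
global_samples = [sum(column) / (n * h) for column in zip(*samples)]
```

On the grid positions, `global_samples` matches the estimate that a centralized computation over the pooled data would give, up to floating-point rounding. How the sampled estimates are then turned into clusters is not described in the abstract and is left out of this sketch.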