Distributed Clustering Based on Sampling Local Density Estimates

Matthias Klusch, Stefano Lodi, Gianluca Moro

In: Proceedings of the International Joint Conference on Artificial Intelligence 2003. International Joint Conference on Artificial Intelligence (IJCAI-03), August 9-15, Acapulco, Mexico, pages 485-490. Morgan Kaufmann Publishers Inc., San Francisco, 2003.


Huge amounts of data are stored in autonomous, geographically distributed sources. The discovery of previously unknown, implicit and valuable knowledge is a key aspect of the exploitation of such sources. In recent years several approaches to knowledge discovery and data mining, and in particular to clustering, have been developed, but only a few of them are designed for distributed data sources. We propose a novel distributed clustering algorithm based on non-parametric kernel density estimation, which takes into account the issues of privacy and communication costs that arise in a distributed environment.

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence