Hierarchical Convex NMF for Clustering Massive Data
Kristian Kersting; Mirwaes Wahabzada; Christian Thurau; Christian Bauckhage
In: Masashi Sugiyama; Qiang Yang (Eds.). Proceedings of the 2nd Asian Conference on Machine Learning. Asian Conference on Machine Learning (ACML-2010), November 8-10, Tokyo, Japan, Pages 253-268, JMLR Proceedings, Vol. 13, JMLR.org, 2010.
We present an extension of convex-hull non-negative matrix factorization (CH-NMF), which was recently proposed as a large-scale variant of convex non-negative matrix factorization or Archetypal Analysis. CH-NMF factorizes a non-negative data matrix into two non-negative matrix factors such that the columns of W are convex combinations of certain data points, so that they are readily interpretable to data analysts. There is, however, no free lunch: imposing convexity constraints on W typically prevents adaptation to intrinsic, low-dimensional structures in the data. Consequently, in cases where the data is distributed in a non-convex manner or consists of mixtures of lower-dimensional convex distributions, the cluster representatives obtained from CH-NMF will be less meaningful. In this paper, we present a hierarchical CH-NMF that automatically adapts to internal structures of a dataset and therefore yields meaningful and interpretable clusters for non-convex datasets. This is confirmed by our extensive evaluation on DBLP publication records of authors, images harvested from the web, and votes on World of Warcraft guilds.
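
The convexity constraint described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' CH-NMF or its hierarchical extension: it only shows what it means for the columns of W to be convex combinations of data points, and then fits a non-negative coefficient matrix for that fixed W with the standard multiplicative update for NMF. The names X (data matrix), G (convex weights), H (coefficient matrix), `convex_basis`, and `fit_coefficients` are assumptions introduced here for illustration, as is the random choice of G.

```python
import numpy as np


def convex_basis(X, G):
    """Return W = X @ G, where each column of G is a convex weight vector.

    Hypothetical helper: G is non-negative and its columns sum to 1, so every
    column of W is a convex combination of the data points (columns of X).
    """
    if np.any(G < 0) or not np.allclose(G.sum(axis=0), 1.0):
        raise ValueError("columns of G must be non-negative and sum to 1")
    return X @ G


def fit_coefficients(X, W, n_iter=200, eps=1e-9, seed=0):
    """Fit a non-negative H with X ~ W @ H for a fixed W.

    Uses the standard multiplicative update for the Frobenius objective,
    which keeps H non-negative when X and W are non-negative.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], X.shape[1]))
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
    return H


# Toy non-negative data: 5 features, 40 data points (one per column).
rng = np.random.default_rng(0)
X = rng.random((5, 40))

# Assumed convex weights: random, then normalised so each column sums to 1.
G = rng.random((40, 3))
G /= G.sum(axis=0, keepdims=True)

W = convex_basis(X, G)      # 3 basis vectors inside the convex hull of the data
H = fit_coefficients(X, W)  # non-negative coefficients for each data point
```

Because each column of W is a weighted average of actual data points, it can be read directly as a "typical mixture" of observations, which is the interpretability argument the abstract makes. The same construction also shows the limitation: averages of points always lie inside the convex hull of the data, so they cannot follow non-convex structure.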