Iterative and Active Graph Clustering Using Trace Norm Minimization Without Cluster Size Constraints

Nir Ailon, Yudong Chen, Huan Xu.

Year: 2015, Volume: 16, Issue: 14, Pages: 455−490


This paper investigates graph clustering under the planted partition model in the presence of small clusters. Traditional results dictate that for an algorithm to provably correctly recover the underlying clusters, all clusters must be sufficiently large---in particular, the cluster sizes need to be $\tilde{\Omega}(\sqrt{n})$, where $n$ is the number of nodes of the graph. We show that this is not really a restriction: by a refined analysis of a convex-optimization-based recovery approach, we prove that small clusters, under certain mild assumptions, do not hinder recovery of large ones. Based on this result, we further devise an iterative algorithm to provably recover almost all clusters via a “peeling strategy”: we recover large clusters first, leading to a reduced problem, and repeat this procedure. These results are extended to the partial observation setting, in which only a (chosen) part of the graph is observed. The peeling strategy gives rise to an active learning algorithm, in which edges adjacent to smaller clusters are queried more often after large clusters are learned (and removed). We expect that the idea of iterative peeling---that is, sequentially identifying a subset of the clusters and reducing the problem to a smaller one---is useful more broadly beyond the specific implementations (based on convex optimization) used in this paper.