Spectral Clustering of Graphs with General Degrees in the Extended Planted Partition Model
Kamalika Chaudhuri, Fan Chung and Alexander Tsiatas JMLR W&CP 23: 35.1 - 35.23, 2012
In this paper, we examine a spectral clustering algorithm for similarity graphs drawn from a simple
random graph model, where nodes are allowed to have varying degrees, and we provide theoretical
bounds on its performance. The random graph model we study is the Extended Planted Partition
(EPP) model, a variant of the classical planted partition model.
The standard approach to spectral clustering of graphs is to compute the bottom k singular vectors or eigenvectors of a suitable graph Laplacian, project the nodes of the graph onto these vectors, and then use an iterative clustering algorithm on the projected nodes. However a challenge with applying this approach to graphs generated from the EPP model is that unnormalized Laplacians do not work, and normalized Laplacians do not concentrate well when the graph has a number of low degree nodes.
We resolve this issue by introducing the notion of a degree-corrected graph Laplacian. For graphs with many low degree nodes, degree correction has a regularizing effect on the Laplacian. Our spectral clustering algorithm projects the nodes in the graph onto the bottom k right singular vectors of the degree-corrected random-walk Laplacian, and clusters the nodes in this subspace. We show guarantees on the performance of this algorithm, demonstrating that it outputs the correct partition under a wide range of parameter values. Unlike some previous work, our algorithm does not require access to any generative parameters of the model.