Submatrix localization via message passing
Bruce Hajek, Yihong Wu, Jiaming Xu; 18(186):1−52, 2018.
Abstract
The principal submatrix localization problem deals with recovering a K×K principal submatrix of elevated mean μ in a large n×n symmetric matrix subject to additive standard Gaussian noise, or more generally, mean zero, variance one, subgaussian noise. This problem serves as a prototypical example for community detection, in which the community corresponds to the support of the submatrix. The main result of this paper is that in the regime Ω(√n)≤K≤o(n), the support of the submatrix can be weakly recovered (with o(K) misclassification errors on average) by an optimized message passing algorithm if λ=μ2K2/n, the signal-to-noise ratio, exceeds 1/e. This extends a result by Deshpande and Montanari previously obtained for K=Θ(√n) and μ=Θ(1). In addition, the algorithm can be combined with a voting procedure to achieve the information-theoretic limit of exact recovery with sharp constants for all K≥nlogn(18e+o(1)). The total running time of the algorithm is O(n2logn). Another version of the submatrix localization problem, known as noisy biclustering, aims to recover a K1×K2 submatrix of elevated mean μ in a large n1×n2 Gaussian matrix. The optimized message passing algorithm and its analysis are adapted to the bicluster problem assuming Ω(√ni)≤Ki≤o(ni) and K1≍K2. A sharp information-theoretic condition for the weak recovery of both clusters is also identified.
© JMLR 2018. (edit, beta) |