Gang Niu, Bo Dai, Lin Shang, Masashi Sugiyama.
Year: 2013, Volume: 14, Issue: 45, Pages: 2641−2687
The large volume principle proposed by Vladimir Vapnik, which advocates that hypotheses lying in an equivalence class with a larger volume are more preferable, is a useful alternative to the large margin principle. In this paper, we introduce a new discriminative clustering model based on the large volume principle called maximum volume clustering (MVC), and then propose two approximation schemes to solve this MVC model: A soft-label MVC method using sequential quadratic programming and a hard-label MVC method using semi-definite programming, respectively. The proposed MVC is theoretically advantageous for three reasons. The optimization involved in hard-label MVC is convex, and under mild conditions, the optimization involved in soft-label MVC is akin to a convex one in terms of the resulting clusters. Secondly, the soft-label MVC method possesses a clustering error bound. Thirdly, MVC includes the optimization problems of a spectral clustering, two relaxed $k$-means clustering and an information-maximization clustering as special limit cases when its regularization parameter goes to infinity. Experiments on several artificial and benchmark data sets demonstrate that the proposed MVC compares favorably with state-of-the-art clustering methods.