Consistency of Semi-Supervised Learning Algorithms on Graphs: Probit and One-Hot Methods
Franca Hoffmann, Bamdad Hosseini, Zhi Ren, Andrew M Stuart; 21(186):1−55, 2020.
Abstract
Graph-based semi-supervised learning is the problem of propagating labels from a small number of labelled data points to a larger set of unlabelled data. This paper is concerned with the consistency of optimization-based techniques for such problems, in the limit where the labels have small noise and the underlying unlabelled data is well clustered. We study graph-based probit for binary classification, and a natural generalization of this method to multi-class classification using one-hot encoding. The resulting objective function to be optimized comprises the sum of a quadratic form defined through a rational function of the graph Laplacian, involving only the unlabelled data, and a fidelity term involving only the labelled data. The consistency analysis sheds light on the choice of the rational function defining the optimization.
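To make the structure of this objective concrete, here is a minimal sketch in pure Python. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not specify the rational function, so the sketch assumes the quadratic form is built from (L + tau^2 I)^alpha with integer alpha, where L is the unnormalized graph Laplacian. It also assumes the fidelity term is a negative log-likelihood under a Gaussian noise model with standard deviation gamma. The function names and the parameters `tau`, `alpha`, and `gamma` are hypothetical.

```python
import math


def graph_laplacian(weights):
    """Unnormalized Laplacian L = D - W for a symmetric weight matrix W."""
    n = len(weights)
    return [
        [(sum(weights[i]) if i == j else 0.0) - weights[i][j] for j in range(n)]
        for i in range(n)
    ]


def matvec(matrix, vector):
    """Dense matrix-vector product."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def normal_cdf(t, gamma):
    """CDF of N(0, gamma^2) at t, written with erfc to stay accurate in the lower tail."""
    return 0.5 * math.erfc(-t / (gamma * math.sqrt(2.0)))


def probit_objective(u, weights, labels, tau=1.0, alpha=2, gamma=0.5):
    """Quadratic form on all nodes plus a probit fidelity term on labelled nodes.

    Assumed form (not taken from the abstract):
        J(u) = 0.5 * <u, (L + tau^2 I)^alpha u> - sum_j log Psi(y_j * u_j; gamma)
    labels maps a node index j to its label y_j in {+1, -1}.
    """
    n = len(u)
    lap = graph_laplacian(weights)
    shifted = [
        [lap[i][j] + (tau ** 2 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    # Apply (L + tau^2 I) alpha times; this term involves no label information.
    v = list(u)
    for _ in range(alpha):
        v = matvec(shifted, v)
    quadratic = 0.5 * sum(ui * vi for ui, vi in zip(u, v))
    # The fidelity term only touches the labelled nodes.
    fidelity = -sum(math.log(normal_cdf(y * u[j], gamma)) for j, y in labels.items())
    return quadratic + fidelity


# Two tightly connected pairs of nodes joined by weak edges; one label per pair.
W = [
    [0.0, 1.0, 0.01, 0.0],
    [1.0, 0.0, 0.0, 0.01],
    [0.01, 0.0, 0.0, 1.0],
    [0.0, 0.01, 1.0, 0.0],
]
labels = {0: +1, 2: -1}
consistent = probit_objective([1.0, 1.0, -1.0, -1.0], W, labels)
flipped = probit_objective([-1.0, -1.0, 1.0, 1.0], W, labels)
```

In this toy graph the quadratic form takes the same value for both candidates, since it is even in u, so the difference between `consistent` and `flipped` comes entirely from the fidelity term. The candidate whose signs agree with the two labels attains the lower objective.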
© JMLR 2020.