The Nyström method for convex loss functions
Andrea Della Vecchia, Ernesto De Vito, Jaouad Mourtada, Lorenzo Rosasco; 25(360):1−60, 2024.
Abstract
We investigate an extension of classical empirical risk minimization, where the hypothesis space is a random subspace of a given Hilbert space. Specifically, we examine the Nyström method, where the subspaces are defined by a random subset of the data. This approach recovers the Nyström approximations used in kernel methods as a special case. Using random subspaces naturally leads to computational advantages, but a key question is whether it compromises learning accuracy. Recently, the tradeoffs between statistics and computation have been explored for the square loss and for self-concordant losses, such as the logistic loss. In this paper, we extend these analyses to general convex Lipschitz losses, which may lack smoothness, such as the hinge loss used in support vector machines. Our main results show the existence of various scenarios where computational gains can be achieved without sacrificing learning performance. When specialized to smooth loss functions, our analysis recovers most previous results. Moreover, it allows us to consider classification problems and to translate the surrogate risk bounds into classification error bounds. In turn, this makes it possible to compare the effect of Nyström approximations combined with different loss functions, such as the hinge or the square loss.
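For intuition, the setting described above can be illustrated with a minimal sketch that is not the authors' estimator: Nyström features are built from a random subset of the data and combined with linear empirical risk minimization under the hinge loss. The dataset, kernel, and all hyperparameters below are illustrative assumptions, using scikit-learn's Nystroem transformer and LinearSVC.

```python
# Minimal sketch (illustrative only): Nyström approximation + hinge-loss ERM.
# The subsample size n_components controls the subspace dimension and hence
# the statistics/computation tradeoff discussed in the paper.
from sklearn.datasets import make_classification
from sklearn.kernel_approximation import Nystroem
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Synthetic binary classification data (illustrative).
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = make_pipeline(
    # Random subset of 200 points defines the low-dimensional feature map.
    Nystroem(kernel="rbf", gamma=0.1, n_components=200, random_state=0),
    # Linear SVM on the Nyström features, i.e. ERM with the hinge loss.
    LinearSVC(loss="hinge", C=1.0, max_iter=10000),
)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```

Increasing n_components enlarges the random subspace, which typically improves accuracy at higher computational cost; the paper's bounds quantify when a small subsample already suffices.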
[abs] [pdf] [bib]
© JMLR 2024.