Asymptotics in Empirical Risk Minimization

Leila Mohammadi; Sara van de Geer

Asymptotics in Empirical Risk Minimization

Leila Mohammadi, Sara van de Geer; 6(68):2027−2047, 2005.

Abstract

In this paper, we study a two-category classification problem. We indicate the categories by labels Y=1 and Y=-1. We observe a covariate, or feature, X ∈ X ⊂ ℜ^d. Consider a collection {h_a} of classifiers indexed by a finite-dimensional parameter a, and the classifier h_a* that minimizes the prediction error over this class. The parameter a* is estimated by the empirical risk minimizer â_n over the class, where the empirical risk is calculated on a training sample of size n. We apply the Kim Pollard Theorem to show that under certain differentiability assumptions, â_n converges to a* with rate n^-1/3, and also present the asymptotic distribution of the renormalized estimator.

For example, let V₀ denote the set of x on which, given X=x, the label Y=1 is more likely (than the label Y=-1). If X is one-dimensional, the set V₀ is the union of disjoint intervals. The problem is then to estimate the thresholds of the intervals. We obtain the asymptotic distribution of the empirical risk minimizer when the classifiers have K thresholds, where K is fixed. We furthermore consider an extension to higher-dimensional X, assuming basically that V₀ has a smooth boundary in some given parametric class.

We also discuss various rates of convergence when the differentiability conditions are possibly violated. Here, we again restrict ourselves to one-dimensional X. We show that the rate is n^-1 in certain cases, and then also obtain the asymptotic distribution for the empirical prediction error.