## Learning Halfspaces with Malicious Noise

**Adam R. Klivans, Philip M. Long, Rocco A. Servedio**; 10(94):2715−2740, 2009.

### Abstract

We give new algorithms for learning halfspaces in the challenging
*malicious noise* model, where an adversary may corrupt both the labels
and the underlying distribution of examples. Our algorithms can
tolerate malicious noise rates exponentially larger than previous work
in terms of the dependence on the dimension *n*, and succeed for the
fairly broad class of all isotropic log-concave distributions.
We give poly(*n*, 1/ε)-time algorithms for solving the following
problems to accuracy ε:

- Learning origin-centered halfspaces in $\mathbf{R}^n$ with respect to the uniform distribution on the unit ball with malicious noise rate $\eta = \Omega(\varepsilon^2 / \log(n/\varepsilon))$. (The best previous result was $\Omega(\varepsilon / (n \log(n/\varepsilon))^{1/4})$.)
- Learning origin-centered halfspaces with respect to any isotropic log-concave distribution on $\mathbf{R}^n$ with malicious noise rate $\eta = \Omega(\varepsilon^3 / \log^2(n/\varepsilon))$. This is the first efficient algorithm for learning under isotropic log-concave distributions in the presence of malicious noise.

We also give a poly(*n*, 1/ε)-time algorithm for learning origin-centered halfspaces under any isotropic log-concave distribution on $\mathbf{R}^n$ in the presence of *adversarial label noise* at rate $\eta = \Omega(\varepsilon^3 / \log(1/\varepsilon))$. In the adversarial label noise setting (or agnostic model), labels can be noisy, but not example points themselves. Previous results could handle $\eta = \Omega(\varepsilon)$ but had running time exponential in an unspecified function of 1/ε.
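The difference between the two noise models can be made concrete with a small simulation. The sketch below is illustrative only and is not taken from the paper: the function names, the `adversary` callback, and the choice to flip the labels nearest the decision boundary are all assumptions made for the example. In the malicious model each example is replaced, with probability η, by an arbitrary (point, label) pair; in the label noise model the points always come from the clean distribution and only labels are altered.

```python
import numpy as np

def sample_unit_ball(rng, m, n):
    """Draw m points uniformly at random from the unit ball in R^n."""
    g = rng.standard_normal((m, n))
    g /= np.linalg.norm(g, axis=1, keepdims=True)   # uniform direction
    r = rng.random(m) ** (1.0 / n)                  # radius with density ~ r^(n-1)
    return g * r[:, None]

def malicious_sample(rng, w, m, eta, adversary):
    """Malicious noise: with probability eta the whole example (x, y)
    is replaced by whatever the (hypothetical) adversary returns."""
    X = sample_unit_ball(rng, m, w.shape[0])
    y = np.where(X @ w >= 0, 1, -1)
    corrupt = rng.random(m) < eta
    for i in np.flatnonzero(corrupt):
        X[i], y[i] = adversary(w)
    return X, y, corrupt

def label_noise_sample(rng, w, m, eta):
    """Label noise: points are untouched; one possible adversary (an
    assumption here) flips the labels of the eta fraction of points
    closest to the decision boundary."""
    X = sample_unit_ball(rng, m, w.shape[0])
    y = np.where(X @ w >= 0, 1, -1)
    flipped = np.zeros(m, dtype=bool)
    flipped[np.argsort(np.abs(X @ w))[: int(eta * m)]] = True
    y[flipped] *= -1
    return X, y, flipped

# Example usage with a hypothetical adversary that plants a mislabeled
# point far on the positive side of the target halfspace.
rng = np.random.default_rng(0)
w = np.zeros(5)
w[0] = 1.0
X_mal, y_mal, corrupt = malicious_sample(rng, w, 1000, 0.1, lambda w: (w.copy(), -1))
X_lab, y_lab, flipped = label_noise_sample(rng, w, 1000, 0.1)
```

Flipping a fixed fraction of a finite sample is only a stand-in for the agnostic model, in which the noise is a property of the joint distribution over labeled examples.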

Our analysis crucially exploits both concentration and anti-concentration properties of isotropic log-concave distributions. Our algorithms combine an iterative outlier removal procedure using Principal Component Analysis together with "smooth" boosting.
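To give a feel for the outlier removal step, here is a minimal sketch of one way an iterative PCA-based filter could look. It is not the paper's procedure: the function name `remove_outliers`, the thresholds `var_bound` and `proj_bound`, and the stopping rule are assumptions chosen for illustration, and the smooth boosting stage is not shown. The idea is that a well-behaved sample has bounded variance in every direction, so a direction of unusually large variance points at planted examples, which are then discarded.

```python
import numpy as np

def remove_outliers(X, var_bound, proj_bound, max_rounds=100):
    """Return a boolean mask of the rows of X that survive filtering.

    var_bound and proj_bound are hypothetical thresholds for this sketch,
    not constants from the paper.
    """
    keep = np.ones(len(X), dtype=bool)
    for _ in range(max_rounds):
        if not keep.any():
            break
        S = X[keep]
        # Uncentered second-moment matrix: the target halfspaces are
        # origin-centered, so variance is measured about the origin.
        M = S.T @ S / len(S)
        eigvals, eigvecs = np.linalg.eigh(M)   # eigenvalues in ascending order
        top_val, top_vec = eigvals[-1], eigvecs[:, -1]
        if top_val <= var_bound:
            break                              # no direction has excess variance
        # Discard surviving points with a large projection on the top direction.
        far = keep & ((X @ top_vec) ** 2 > proj_bound)
        if not far.any():
            break
        keep &= ~far
    return keep
```

For the uniform distribution on the unit ball in $\mathbf{R}^n$, the variance in any fixed direction is $1/(n+2)$, so in an experiment `var_bound` could be set to a small multiple of that value.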

© JMLR 2009.