Agnostic Learning of Disjunctions on Symmetric Distributions
Vitaly Feldman, Pravesh Kothari.
Year: 2015, Volume: 16, Issue: 106, Pages: 3455–3467
Abstract
We consider the problem of approximating and learning disjunctions (or, equivalently, conjunctions) on symmetric distributions over $\{0,1\}^n$. Symmetric distributions are distributions whose probability mass function is invariant under every permutation of the variables. We prove that for every symmetric distribution $\mathcal{D}$ there exists a set $\mathbb{S}$ of $n^{O(\log{(1/\epsilon)})}$ functions such that for every disjunction $c$, some function $p$ expressible as a linear combination of functions in $\mathbb{S}$ $\epsilon$-approximates $c$ in $\ell_1$ distance on $\mathcal{D}$, that is, $\mathbf{E}_{x \sim \mathcal{D}}[|c(x)-p(x)|] \leq \epsilon$. This implies an agnostic learning algorithm for disjunctions on symmetric distributions that runs in time $n^{O(\log{(1/\epsilon)})}$. The best previously known bound is $n^{O(1/\epsilon^4)}$ and follows from the approximation of the more general class of halfspaces (Wimmer, 2010). We also show that there exists a symmetric distribution $\mathcal{D}$ such that the minimum degree of a polynomial that $1/3$-approximates the disjunction of all $n$ variables in $\ell_1$ distance on $\mathcal{D}$ is $\Omega(\sqrt{n})$. Therefore, the learning result above cannot be achieved via $\ell_1$-regression with a polynomial basis, the approach used in most other agnostic learning algorithms.
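The reduction from $\ell_1$-approximation to agnostic learning is the standard $\ell_1$-regression template: fit a linear combination of the basis functions minimizing empirical $\ell_1$ error, then round to a Boolean hypothesis. The following is a minimal Python sketch of that template, not the paper's algorithm: the basis of small conjunctions and the $0.5$ rounding threshold are illustrative assumptions standing in for the set $\mathbb{S}$ constructed in the paper.

```python
# Sketch of agnostic learning via l1-regression over a fixed function
# basis. Assumes numpy and scipy; the conjunction basis below is a
# hypothetical stand-in for the paper's set S, chosen for illustration.
import itertools
import numpy as np
from scipy.optimize import linprog

def conjunction_features(X, degree):
    """Map each x in {0,1}^n to the values of all monotone conjunctions
    (monomials) on at most `degree` variables, plus a constant feature."""
    m, n = X.shape
    cols = [np.ones(m)]
    for d in range(1, degree + 1):
        for T in itertools.combinations(range(n), d):
            cols.append(X[:, list(T)].prod(axis=1))
    return np.column_stack(cols)

def l1_regression(F, y):
    """Minimize sum_i |<w, F[i]> - y[i]| over w via the standard LP:
    slacks t_i >= |<w, F[i]> - y[i]|, objective sum_i t_i."""
    m, k = F.shape
    # F w - t <= y  and  -F w - t <= -y  together encode t >= |F w - y|.
    A_ub = np.block([[F, -np.eye(m)], [-F, -np.eye(m)]])
    b_ub = np.concatenate([y, -y])
    c = np.concatenate([np.zeros(k), np.ones(m)])
    bounds = [(None, None)] * k + [(0, None)] * m  # w free, t >= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.x[:k]

# Toy run: noisy labels of the disjunction x_1 OR x_2 on n = 6 variables.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 6))
y = np.maximum(X[:, 0], X[:, 1]).astype(float)
noise = rng.random(200) < 0.05
y[noise] = 1 - y[noise]

F = conjunction_features(X, degree=2)
w = l1_regression(F, y)
preds = (F @ w >= 0.5).astype(float)   # threshold the real-valued fit
print("empirical error:", np.mean(preds != y))
```

The point of the theorem is which basis to regress over: once every disjunction is $\epsilon$-close in $\ell_1$ to some linear combination of the basis, the empirical $\ell_1$ minimizer achieves error within $\epsilon$ of the best disjunction, and the running time is governed by the basis size.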
Our technique also gives a simple proof that for any product distribution $\mathcal{D}$ and every disjunction $c$, there exists a polynomial $p$ of degree $O(\log{(1/\epsilon)})$ such that $p$ $\epsilon$-approximates $c$ in $\ell_1$ distance on $\mathcal{D}$. This was first proved by Blais et al. (2008) via a more involved argument.
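To make the intuition concrete in the simplest special case (a sketch for the uniform distribution $\mathcal{U}$ on $\{0,1\}^n$ only; general product distributions, where the variables may have very different biases, need more care): write $c = \bigvee_{i \in T} x_i$. If $|T| \leq \log_2{(1/\epsilon)}$, then $c$ is computed exactly by the polynomial $1 - \prod_{i \in T}(1-x_i)$, whose degree is $|T| \leq \log_2{(1/\epsilon)}$. Otherwise the constant polynomial $p \equiv 1$ already $\epsilon$-approximates $c$, since $$\mathbf{E}_{x \sim \mathcal{U}}[|c(x)-1|] = \Pr_{x \sim \mathcal{U}}[x_i = 0 \text{ for all } i \in T] = 2^{-|T|} \leq \epsilon.$$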