New Bounds for Learning Intervals with Implications for Semi-Supervised Learning
David P. Helmbold and Philip M. Long
JMLR: Workshop and Conference Proceedings 23:30.1–30.15, 2012
We study learning of initial intervals in the prediction model. We show that for each distribution $D$ over the domain, there is an algorithm $A_D$ whose probability of a mistake in round $m$ is at most $(1/2 + o(1))/m$. We also show that the best possible bound that can be achieved when the same algorithm $A$ must be applied for all distributions $D$ is at least $\left(\frac{1}{\sqrt{e}} - o(1)\right)\frac{1}{m} > \left(\frac{3}{5} - o(1)\right)\frac{1}{m}$. Informally, "knowing" the distribution $D$ enables an algorithm to reduce its error rate by a constant factor strictly greater than 1. As advocated by Ben-David et al. (2008), knowledge of $D$ can be viewed as an idealized proxy for a large number of unlabeled examples.
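The inequality between the two constants is elementary arithmetic; as a quick check: since $e \approx 2.71828 < 25/9$, we have $\sqrt{e} < 5/3$, and hence
$$\frac{1}{\sqrt{e}} > \frac{3}{5}, \qquad \frac{1}{\sqrt{e}} \approx 0.6065.$$
Consequently the distribution-free lower bound exceeds the distribution-specific rate $(1/2 + o(1))/m$ by a factor approaching $2/\sqrt{e} \approx 1.213$, which is the "constant factor strictly greater than 1" referred to above.

To make the prediction-model setting concrete, the following is a minimal simulation sketch, not the paper's algorithm. It assumes $D$ is uniform on $[0,1]$, the target is an initial interval $[0,t]$, and the learner is the generic tightest-fit rule (predict positive iff the query is at most the largest positive example seen so far). It only illustrates that the round-$m$ mistake probability scales on the order of $1/m$, the quantity whose leading constant the paper pins down.

import random

def mistake_probability(m, t=0.37, trials=100_000):
    """Estimate P(mistake in round m) for the tightest-fit learner
    on the initial interval [0, t] under the uniform distribution."""
    mistakes = 0
    for _ in range(trials):
        # Draw m i.i.d. points; the first m-1 are labeled history,
        # the m-th is the query whose label must be predicted.
        xs = [random.random() for _ in range(m)]
        history, query = xs[:-1], xs[-1]
        # Tightest-fit threshold: largest positive example seen (0.0 if none).
        threshold = max((x for x in history if x <= t), default=0.0)
        # Mistake iff the predicted label disagrees with the true label.
        mistakes += (query <= threshold) != (query <= t)
    return mistakes / trials

if __name__ == "__main__":
    for m in (10, 20, 40, 80):
        p = mistake_probability(m)
        print(f"m={m:3d}  P(mistake)~{p:.4f}  m*P~{m*p:.2f}")

For this naive rule, $m \cdot P(\text{mistake})$ tends to 1 as $m$ grows; the paper's contribution is pinning down the optimal leading constant, showing it is $1/2$ when $D$ is known and at least $1/\sqrt{e}$ when a single algorithm must handle all distributions.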