Estimating Labels from Label Proportions

Novi Quadrianto, Alex J. Smola, Tibério S. Caetano, Quoc V. Le.

Year: 2009, Volume: 10, Issue: 82, Pages: 2349–2374


Consider the following problem: given several sets of unlabeled observations, each with known label proportions, predict the labels of another set of observations, whose label proportions may also be known. This problem arises in areas such as e-commerce, politics, spam filtering, and improper content detection. We present consistent estimators that can reconstruct the correct labels with high probability in a uniform convergence sense. Experiments show that our method works well in practice.
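
To make the problem setting concrete, the following is a minimal illustrative sketch in Python. It is not the estimator presented in the paper. It rests on one observation: the mean of each set is a mixture of the per-class means, weighted by that set's known label proportions, so the class means can be recovered by least squares. The synthetic Gaussian data, the nearest-mean classifier, and the names estimate_class_means and predict_labels are all assumptions introduced for illustration.

```python
import numpy as np

def estimate_class_means(bag_means, proportions):
    # Each bag mean is modeled as proportions @ class_means;
    # solve that linear system in the least-squares sense.
    class_means, *_ = np.linalg.lstsq(proportions, bag_means, rcond=None)
    return class_means

def predict_labels(X, class_means):
    # Assign each observation to the class with the nearest estimated mean.
    sq_dist = ((X[:, None, :] - class_means[None, :, :]) ** 2).sum(axis=2)
    return sq_dist.argmin(axis=1)

def sample_bag(rng, true_means, p, size):
    # Hypothetical data generator: the bag holds exactly the stated proportions.
    n_class0 = int(np.rint(p[0] * size))
    labels = np.repeat([0, 1], [n_class0, size - n_class0])
    X = true_means[labels] + rng.normal(size=(size, true_means.shape[1]))
    return X, labels

rng = np.random.default_rng(0)
true_means = np.array([[-2.0, 0.0], [2.0, 0.0]])  # assumed, for the demo only

# Three training sets: only the observations and proportions are used.
proportions = np.array([[0.8, 0.2], [0.3, 0.7], [0.5, 0.5]])
bag_means = np.array(
    [sample_bag(rng, true_means, p, 2000)[0].mean(axis=0) for p in proportions]
)
est_means = estimate_class_means(bag_means, proportions)

# A further set whose labels are to be predicted.
test_X, test_labels = sample_bag(rng, true_means, np.array([0.6, 0.4]), 1000)
accuracy = (predict_labels(test_X, est_means) == test_labels).mean()
print("estimated class means:\n", est_means)
print("test accuracy:", accuracy)
```

The sketch needs at least as many sets with linearly independent proportion vectors as there are classes; with fewer, the least-squares system is underdetermined and the class means cannot be identified.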