Bandit Theory meets Compressed Sensing for high dimensional Stochastic Linear Bandit

Alexandra Carpentier, Remi Munos ; JMLR W&CP 22: 190-198, 2012.

Abstract

We consider a linear stochastic bandit problem where the dimension K of the unknown parameter heta is larger than the sampling budget n. Since usual linear bandit algorithms have a regret in O(K\sqrt{n}), it is in general impossible to obtain a sub-linear regret without further assumption. In this paper we make the assumption that heta is S-sparse, i.e. has at most S-non-zero components, and that the space of arms is the unit ball for the ||.||_2 norm. We combine ideas from Compressed Sensing and Bandit Theory to derive an algorithm with a regret bound in O(S\sqrt{n}). We detail an application to the problem of optimizing a function that depends on many variables but among which only a small number of them (initially unknown) are relevant.




Home Page

Papers

Submissions

News

Scope

Editorial Board

Announcements

Proceedings

Open Source Software

Search

Login



RSS Feed

Page last modified on Thu April 26 2012 13:56 2012.

Copyright @ JMLR 2012. All rights reserved.