Processing math: 100%



Home Page

Papers

Submissions

News

Editorial Board

Special Issues

Open Source Software

Proceedings (PMLR)

Data (DMLR)

Transactions (TMLR)

Search

Statistics

Login

Frequently Asked Questions

Contact Us



RSS Feed

Efficient Change-Point Detection for Tackling Piecewise-Stationary Bandits

Lilian Besson, Emilie Kaufmann, Odalric-Ambrym Maillard, Julien Seznec; 23(77):1−40, 2022.

Abstract

We introduce GLRklUCB, a novel algorithm for the piecewise iid non-stationary bandit problem with bounded rewards. This algorithm combines an efficient bandit algorithm, klUCB, with an efficient, parameter-free, change-point detector, the Bernoulli Generalized Likelihood Ratio Test, for which we provide new theoretical guarantees of independent interest. Unlike previous non-stationary bandit algorithms using a change-point detector, GLRklUCB does not need to be calibrated based on prior knowledge on the arms' means. We prove that this algorithm can attain a O(TAΥTlog(T)) regret in T rounds on some “easy” instances in which there is sufficient delay between two change-points, where A is the number of arms and ΥT the number of change-points, without prior knowledge of ΥT. In contrast with recently proposed algorithms that are agnostic to ΥT, we perform a numerical study showing that GLRklUCB is also very efficient in practice, beyond easy instances.

[abs][pdf][bib]        [code]
© JMLR 2022. (edit, beta)

Mastodon