Lenient Learning in Independent-Learner Stochastic Cooperative Games

Ermo Wei, Sean Luke; 17(84):1−42, 2016.

Abstract

We introduce the Lenient Multiagent Reinforcement Learning 2 (LMRL2) algorithm for independent-learner stochastic cooperative games. LMRL2 is designed to overcome a pathology called relative overgeneralization, and to do so while still performing well in games with stochastic transitions, stochastic rewards, and miscoordination. We discuss the existing literature, then compare LMRL2 against other algorithms drawn from the literature which can be used for games of this kind: traditional (“Distributed”) Q-learning, Hysteretic Q-learning, WoLF-PHC, SOoN, and (for repeated games only) FMQ. The results show that LMRL2 is very effective in both of our measures (complete and correct policies), and is found in the top rank more often than any other technique. LMRL2 is also easy to tune: though it has many available parameters, almost all of them stay at default settings. Generally the algorithm is optimally tuned with a single parameter, if any. We then examine and discuss a number of side-issues and options for LMRL2.

[abs][pdf][bib]




Home Page

Papers

Submissions

News

Editorial Board

Announcements

Proceedings

Open Source Software

Search

Statistics

Login

Contact Us



RSS Feed