## Machine Learning in an Auction Environment

*Patrick Hummel, R. Preston McAfee*; 17(197):1−37, 2016.

### Abstract

We consider a model of repeated online auctions in which an ad
with an uncertain click-through rate faces a random distribution
of competing bids in each auction and there is discounting of
payoffs. We formulate the optimal solution to this
explore/exploit problem as a dynamic programming problem and
show that efficiency is maximized by making a bid for each
advertiser equal to the advertiser's expected value for the
advertising opportunity plus a term proportional to the variance
in this value divided by the number of impressions the
advertiser has received thus far. We then use this result to
illustrate that the value of incorporating active exploration in
an auction environment is exceedingly small.

[abs][pdf][bib]