Next: The Convergence of the
Up: Convergence Theorems in Generalized
Previous: Convergence Theorems in Generalized
Let
be an arbitrary state-space and denote by
the set of value functions over
(i.e., the set of bounded
functions), and let
be an arbitrary contraction mapping with (unique)
fixed point
.
Let
be
a sequence of stochastic operators. The second argument of
is intended to modify the first one, in order to get a better
approximation of
. Formally, let
be an arbitrary value
function and let
.
is said to
approximate
at
with probability one over
, if
uniformly over
.
Theorem A.1 (Szepesvári and Littman)
Let the sequence of random operators
![$ T_t$](img84.gif)
approximate
![$ T$](img31.gif)
at
![$ V^*$](img30.gif)
with probability one uniformly over
![$ X$](img10.gif)
. Let
![$ V_0$](img98.gif)
be an
arbitrary value function, and define
![$ V_{t+1} = T_t(V_t, V_t)$](img105.gif)
. If
there exist functions
![$ 0 \leq F_t(x) \leq 1$](img106.gif)
and
![$ 0 \leq G_t(x)
\leq 1$](img107.gif)
satisfying the conditions below with probability one, then
![$ V_t$](img93.gif)
converges to
![$ V^*$](img30.gif)
with probability one uniformly over
![$ X$](img10.gif)
:
- for all
and all
,
- for all
and all
,
- for all
,
converges to zero uniformly in
as
increases; and,
- there exists
such that for all
and large enough
,
The proof can be found in [Szepesvári and Littman(1996)]. We cite
here the lemma, which is the base of the proof, since our
generalization concerns this lemma.
Lemma A.2
Let
![$ X$](img10.gif)
be an arbitrary set,
![$ 0 \le \gamma < 1$](img23.gif)
and consider the
sequence
where
![$ x\in X$](img109.gif)
,
![$ \left\Vert w_1\right\Vert < C < \infty$](img295.gif)
with probability one for
some
![$ C>0$](img296.gif)
,
![$ 0 \le G_t(x) \le 1$](img297.gif)
and
![$ 0 \le G_t(x) \le 1$](img297.gif)
for all
![$ t$](img22.gif)
, and
![$ \limsup_{t\to\infty} h_t \to 0$](img298.gif)
w.p.1. Assume that for
all
![$ k$](img299.gif)
,
![$ \lim_{n \to \infty} \prod_{t=k}^{n} G_t(x) = 0$](img300.gif)
uniformly in
![$ x$](img16.gif)
w.p.1 and
![$ F_t \le \gamma(1-G_t(x))$](img301.gif)
w.p.1. Then
![$ \left\Vert w_t\right\Vert $](img302.gif)
converges to 0 w.p.1 as well.
Next: The Convergence of the
Up: Convergence Theorems in Generalized
Previous: Convergence Theorems in Generalized