....1
Unless otherwise noted, $\Vert\cdot\Vert$ denotes the max-norm.
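For reference, the max-norm is commonly defined as below for a finite-dimensional vector $v$ and for a bounded function $f$ on a set $X$. The symbols $v$, $f$ and $X$ are placeholders introduced here, not notation taken from the main text:

```latex
\[
  \Vert v \Vert = \max_{i} |v_i| ,
  \qquad
  \Vert f \Vert = \sup_{x \in X} |f(x)| .
\]
```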
... increases;2
Note that the convergence of an infinite product implies that the terms converge to one.
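The reasoning behind this remark fits in one line. As a sketch, assume the product converges in the usual sense, i.e. the partial products $P_n = \prod_{k=1}^{n} a_k$ tend to a finite limit $P \neq 0$ (the symbols $a_n$, $P_n$ and $P$ are introduced here for illustration only):

```latex
\[
  a_n \;=\; \frac{P_n}{P_{n-1}} \;\longrightarrow\; \frac{P}{P} \;=\; 1
  \qquad (n \to \infty).
\]
```

The assumption $P \neq 0$ matters: for $a_n = 1/2$ the partial products tend to zero while the terms do not tend to one.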
... (E-learning,3
The capital letter E distinguishes E-learning from internet-based concepts that use the lower-case prefix `e'.
....4
Note that $E(x,y^d)$ depends on both $\pi^E$ and $\pi^A$. When no ambiguity can arise, we do not show these dependencies explicitly.
... tracking5
The term `velocity field tracking' may better represent the underlying objective of speed field tracking.
...Szepesvari97Neurocontroller,Szepesvari97Approximate.6
Sign-properness imposes conditions on the sign but not on the magnitude of the components of the output of the approximate inverse dynamics.
... satisfied.7
Justifying this assumption requires techniques from the theory of ordinary differential equations and is omitted here. See also [Barto(1978)].
....8
Note that the condition on $P(x,\cdot,y)$ is a kind of Lipschitz-continuity.
... arm.9
The parameters for SARSA were taken from the work of [Aamodt(1997)], as was the SARSA implementation itself, and can be considered near-optimal for that implementation.
... `on'.10
Note that the optimal value function is not available, so the norm was computed relative to the last state of the experiment.
... distributions11
$ P(X^-)$ abbreviates $ P(x\in X^-) = \sum_{x\in X^-} P(x)$.
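A minimal sketch of this abbreviation for a finite set, assuming the distribution is stored as a dictionary from elements to probabilities. The names `prob_of_set`, `P` and `X_minus`, and the numbers used, are hypothetical and chosen only for illustration:

```python
from fractions import Fraction

def prob_of_set(P, subset):
    """Return P(subset), i.e. the sum of P(x) over all x in subset."""
    return sum((P[x] for x in subset), Fraction(0))

# Hypothetical distribution over four elements (illustrative values only).
P = {
    "a": Fraction(1, 2),
    "b": Fraction(1, 4),
    "c": Fraction(1, 8),
    "d": Fraction(1, 8),
}
X_minus = {"b", "d"}  # hypothetical subset playing the role of X^-

p_minus = prob_of_set(P, X_minus)
```

Exact fractions are used here only so that the sum carries no floating-point rounding; ordinary floats would serve equally well in practice.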