As with regular MDPs,
$\varepsilon$-stationary MDPs can also be
generalized with general environment and agent operators. The
resulting model inherits the advantages of both generalizations:
a broad range of decision problems can be
discussed simultaneously, while the underlying environment is
allowed to change over time. This family of MDPs will be
called generalized $\varepsilon$-stationary MDPs, or
$\varepsilon$-MDPs for short.
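Given a prescribed $\varepsilon \ge 0$, a generalized
$\varepsilon$-MDP is defined by the tuple
$\langle X, A, R, \{\bigoplus_t\}, \{\bigotimes_t\} \rangle$, with
$\bigoplus_t : (X \times A \times X \to \mathbb{R}) \to (X \times A \to \mathbb{R})$
and
$\bigotimes_t : (X \times A \to \mathbb{R}) \to (X \to \mathbb{R})$,
$t = 1, 2, 3, \ldots$, if there exists a generalized MDP
$\langle X, A, R, \bigoplus, \bigotimes \rangle$ such that
$\limsup_{t \to \infty} \| \bigotimes_t \bigoplus_t - \bigotimes \bigoplus \| \le \varepsilon$.
Note that the last assumption requires that the asymptotic distance
between the corresponding dynamic-programming operator sequence
$T_t$ and $T$ is small.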
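Note also that the given definition is indeed a generalization of
both concepts: setting $\varepsilon = 0$,
$\bigoplus_t = \bigoplus$ and $\bigotimes_t = \bigotimes$
for all $t$ yields a generalized MDP, while setting
$(\bigoplus_t S)(x, a) = \sum_y P_t(x, a, y)\, S(x, a, y)$ and
$(\bigotimes_t Q)(x) = \max_a Q(x, a)$
for all $t$ simplifies to an
$\varepsilon$-stationary MDP.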
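
The closeness condition in the definition can be probed numerically. The following is a minimal Python sketch, not the authors' construction: it assumes the ε-stationary special case in which the environment operator is the expectation under a time-dependent transition function and the agent operator is maximization over actions. The two-state problem, the names `perturb`, `expect`, `maximize` and `gap`, and the perturbation size `delta` are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical toy problem (not from the text): 2 states, 2 actions.
N_STATES, N_ACTIONS = 2, 2

# Base transition probabilities P[x][a][y] of the limiting model.
P = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.1, 0.9]],
]

def perturb(P, delta, rng):
    """Build P_t by moving at most `delta` probability mass per (x, a) pair."""
    P_t = []
    for x in range(N_STATES):
        row = []
        for a in range(N_ACTIONS):
            p0, p1 = P[x][a]
            d = rng.uniform(-delta, delta)
            # Clip so that both entries stay valid probabilities.
            d = max(-p0, min(d, p1))
            row.append([p0 + d, p1 - d])
        P_t.append(row)
    return P_t

def expect(P_t, S):
    """Expected-value-type operator: (X x A x X -> R) -> (X x A -> R)."""
    return [[sum(P_t[x][a][y] * S[x][a][y] for y in range(N_STATES))
             for a in range(N_ACTIONS)] for x in range(N_STATES)]

def maximize(Q):
    """Maximization-type operator: (X x A -> R) -> (X -> R)."""
    return [max(Q[x]) for x in range(N_STATES)]

def gap(P_t, S):
    """Sup-norm distance between perturbed and base composite operators at S."""
    v_t = maximize(expect(P_t, S))
    v = maximize(expect(P, S))
    return max(abs(u - w) for u, w in zip(v_t, v))

rng = random.Random(0)
delta = 0.05

# A single test function S(x, a, y) with |S| <= 1 (an arbitrary choice).
S = [[[rng.uniform(-1, 1) for _ in range(N_STATES)]
      for _ in range(N_ACTIONS)] for _ in range(N_STATES)]

# Largest observed gap over many perturbed environments.
worst_gap = max(gap(perturb(P, delta, rng), S) for _ in range(1000))
print(worst_gap <= 2 * delta)
```

Because maximization is a non-expansion and each perturbed row differs from the base row by at most `2 * delta` in total variation, the gap at any `S` bounded by 1 cannot exceed `2 * delta`. The sketch evaluates the distance at one sample function only, so it illustrates the condition rather than computing the operator norm itself.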