Stability Bounds for Stationary φ-mixing and β-mixing Processes
Mehryar Mohri, Afshin Rostamizadeh; 11(26):789−814, 2010.
Abstract
Most generalization bounds in learning theory are based on some
measure of the complexity of the hypothesis class used,
independently of any algorithm. In contrast, the notion of
algorithmic stability can be used to derive tight generalization
bounds that are tailored to specific learning algorithms by
exploiting their particular properties. However, as in much of
learning theory, existing stability analyses and bounds apply only
in the scenario where the samples are independently and identically
distributed. In many machine learning applications, this
assumption does not hold: the observations received by the learning
algorithm often exhibit some inherent temporal dependence.
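For reference, a standard formulation of this notion (not spelled out in the abstract) is uniform stability: a learning algorithm is uniformly stable with coefficient Δ with respect to a loss ℓ if, for any two training samples S and S' of size m differing in a single point, the hypotheses h_S and h_{S'} it returns satisfy

\[
\sup_{z} \bigl| \ell(h_S, z) - \ell(h_{S'}, z) \bigr| \le \Delta .
\]

The symbol Δ is used here only to avoid a clash with the β-mixing coefficient below; the paper's own notation and the exact variant of stability it adopts may differ.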
This paper studies the scenario where the observations are drawn
from a stationary φ-mixing or β-mixing sequence, a
widely adopted assumption in the study of non-i.i.d. processes that
implies a dependence between observations weakening over time. We
prove novel and distinct stability-based generalization bounds for
stationary φ-mixing and β-mixing sequences. These
bounds strictly generalize the bounds given in the i.i.d. case and
apply to all stable learning algorithms, thereby extending the
use of stability bounds to non-i.i.d. scenarios.
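For readers unfamiliar with the mixing terminology, the standard definitions (the notation here is the usual one and may differ slightly from the paper's) are as follows. For a stationary sequence (Z_t), let σ_1^n denote the σ-algebra generated by Z_1, …, Z_n and σ_{n+k}^∞ the one generated by Z_{n+k}, Z_{n+k+1}, …. The mixing coefficients are

\[
\varphi(k) \;=\; \sup_{n,\; A \in \sigma_{n+k}^{\infty},\; B \in \sigma_{1}^{n},\; \Pr[B] > 0} \bigl|\Pr[A \mid B] - \Pr[A]\bigr|,
\qquad
\beta(k) \;=\; \sup_{n}\; \mathbb{E}\Bigl[\, \sup_{A \in \sigma_{n+k}^{\infty}} \bigl|\Pr[A \mid \sigma_{1}^{n}] - \Pr[A]\bigr| \,\Bigr],
\]

and the sequence is φ-mixing (respectively β-mixing) when φ(k) → 0 (respectively β(k) → 0) as k → ∞, which makes precise the sense in which the dependence between observations weakens over time.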
We also illustrate the application of our φ-mixing
generalization bounds to general classes of learning algorithms,
including Support Vector Regression, Kernel Ridge Regression, and
Support Vector Machines, as well as many other kernel regularization-based
and relative entropy-based regularization algorithms. These novel
bounds can thus be viewed as the first theoretical basis for the use
of these algorithms in non-i.i.d. scenarios.
© JMLR 2010.