Deep Learning of Representations for Unsupervised and Transfer Learning
Y. Bengio; JMLR W&CP 27:17–36, 2012.
Abstract
Deep learning algorithms seek to exploit the unknown structure in the input
distribution in order to discover good representations, often at multiple
levels, with higher-level learned features defined in terms of lower-level
features. The objective is to make these higher-level representations more
abstract, with their individual features more invariant to most of the
variations that are typically present in the training distribution, while
collectively preserving as much as possible of the information in the input.
Ideally, we would like these representations to disentangle the unknown
factors of variation that underlie the training distribution. Such
unsupervised learning of representations can be exploited usefully under
the hypothesis that the input distribution P(x) is structurally related to
some task of interest, say predicting P(y|x). This paper focuses on the
context of
the Unsupervised and Transfer Learning Challenge, on why unsupervised
pre-training of representations can be useful, and how it can be exploited in
the transfer learning scenario, where we care about predictions on examples
that are not from the same distribution as the training distribution.