## DSA: Decentralized Double Stochastic Averaging Gradient Algorithm

*Aryan Mokhtari, Alejandro Ribeiro*; 17(61):1−35, 2016.

### Abstract

This paper considers optimization problems where nodes of a
network have access to summands of a global objective. Each of
these local objectives is further assumed to be an average of a
finite set of functions. The motivation for this setup is to
solve large scale machine learning problems where elements of
the training set are distributed to multiple computational
elements. The decentralized double stochastic averaging gradient
(DSA) algorithm is proposed as a solution alternative that
relies on: (i) the use of local stochastic averaging gradients;
and (ii) the determination of descent steps as differences of
consecutive stochastic averaging gradients. Strong convexity of
the local functions and Lipschitz continuity of the local
gradients are shown to guarantee linear convergence of the
sequence generated by DSA in expectation. Local iterates are further shown to
approach the optimal argument for almost all realizations. The
expected linear convergence of DSA is in contrast to the
sublinear rate characteristic of existing methods for
decentralized stochastic optimization. Numerical experiments on
a logistic regression problem illustrate reductions in
convergence time and number of feature vectors processed until
convergence relative to these other alternatives.
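The two ingredients named in the abstract can be illustrated with a minimal numerical sketch. Everything below is an illustrative assumption rather than the paper's implementation: the scalar quadratic summands, the complete-graph mixing matrix `W`, the step size `alpha`, and the iteration count are all toy choices. Each node keeps a SAG-style table of the most recently seen summand gradients (ingredient (i)), and the update after the first step uses the difference of consecutive averaged gradients together with an EXTRA-style two-step recursion with `W_tilde = (I + W)/2` (ingredient (ii)).

```python
import numpy as np

rng = np.random.default_rng(0)

n, q = 3, 5                        # nodes, summands per node (toy sizes)
b = rng.normal(size=(n, q))        # local summands f_{i,l}(x) = 0.5 * (x - b[i,l])^2
x_opt = b.mean()                   # minimizer of the global average objective

W = np.full((n, n), 1.0 / n)       # doubly stochastic mixing matrix (complete graph)
W_tilde = (np.eye(n) + W) / 2      # second mixing matrix used from the second step on

alpha = 0.1                        # constant step size (illustrative)

# SAG-style table: last gradient seen for each local summand,
# initialized by evaluating every summand gradient at x^0 = 0.
x_prev = np.zeros(n)
grads = x_prev[:, None] - b
g_prev = grads.mean(axis=1)        # local stochastic averaging gradients at t = 0

x_curr = W @ x_prev - alpha * g_prev   # first iterate

for _ in range(3000):
    # each node refreshes one randomly chosen summand gradient per iteration
    for i in range(n):
        l = rng.integers(q)
        grads[i, l] = x_curr[i] - b[i, l]
    g_curr = grads.mean(axis=1)
    # descent step uses the difference of consecutive averaged gradients
    x_next = x_curr + W @ x_curr - W_tilde @ x_prev - alpha * (g_curr - g_prev)
    x_prev, x_curr, g_prev = x_curr, x_next, g_curr

print(x_curr, x_opt)
```

Despite the constant step size and the stochastic gradient table, the iterates of all nodes settle on the exact minimizer of the global objective, which is the behavior the abstract contrasts with the sublinear rates of diminishing-step-size alternatives.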
