## On Faster Convergence of Cyclic Block Coordinate Descent-type Methods for Strongly Convex Minimization

*Xingguo Li, Tuo Zhao, Raman Arora, Han Liu, Mingyi Hong*; 18(184):1−24, 2018.

### Abstract

The cyclic block coordinate descent-type (CBCD-type) methods,
which perform iterative updates for a few coordinates (a block)
simultaneously throughout the procedure, have shown remarkable
computational performance for solving strongly convex
minimization problems. Typical applications include many popular
statistical machine learning methods such as elastic-net
regression, ridge penalized logistic regression, and sparse
additive regression. Existing optimization literature has shown
that for strongly convex minimization, the CBCD-type methods
attain iteration complexity of $\mathcal{O}(p\log(1/\epsilon))$, where
$\epsilon$ is a pre-specified accuracy of the objective value,
and $p$ is the number of blocks. However, such iteration
complexity explicitly depends on $p$, and therefore is at least
$p$ times worse than the complexity $\mathcal{O}(\log(1/\epsilon))$ of
gradient descent (GD) methods. To bridge this theoretical gap,
we propose an improved convergence analysis for the CBCD-type
methods. In particular, we first show that for a family of
quadratic minimization problems, the iteration complexity
$\mathcal{O}(\log^2(p)\cdot\log(1/\epsilon))$ of the CBCD-type methods
matches that of the GD methods in term of dependency on $p$, up
to a $\log^2 p$ factor. Thus our complexity bounds are sharper
than the existing bounds by at least a factor of $p/\log^2(p)$.
We also provide a lower bound to confirm that our improved
complexity bounds are tight (up to a $\log^2 (p)$ factor), under
the assumption that the largest and smallest eigenvalues of the
Hessian matrix do not scale with $p$. Finally, we generalize our
analysis to other strongly convex minimization problems beyond
quadratic ones.

[abs][pdf][bib]