Path Length Bounds for Gradient Descent and Flow
Chirag Gupta, Sivaraman Balakrishnan, Aaditya Ramdas; 22(68):1−63, 2021.
Abstract
We derive bounds on the path length ζ of gradient descent (GD) and gradient flow (GF) curves for various classes of smooth convex and nonconvex functions. Among other results, we prove that: (a) if the iterates are linearly convergent with factor (1−c), then ζ is at most O(1/c); (b) under the Polyak-Kurdyka-\L ojasiewicz (PKL) condition, ζ is at most O(√κ), where κ is the condition number, and at least ˜Ω(√d∧κ1/4); (c) for quadratics, ζ is Θ(min and in some cases can be independent of \kappa; (d) assuming just convexity, \zeta can be at most 2^{4d\log d}; (e) for separable quasiconvex functions, \zeta is {\Theta}(\sqrt{d}). Thus, we advance current understanding of the properties of GD and GF curves beyond rates of convergence. We expect our techniques to facilitate future studies for other algorithms.
© JMLR 2021. (edit, beta) |