Convergence Rates for the Stochastic Gradient Descent Method for Non-Convex Objective Functions

Benjamin Fehrman; Benjamin Gess; Arnulf Jentzen

We prove convergence to minima, together with estimates on the rate of convergence, for the stochastic gradient descent method in the case of objective functions that are not necessarily locally convex or contracting. In particular, the analysis relies on a quantitative use of mini-batches to control the loss of iterates to non-attracted regions. We also show that the results apply to simple objective functions arising in machine learning.

Abstract