Prior Knowledge and Preferential Structures in Gradient Descent Learning Algorithms

Robert E. Mahony, Robert C. Williamson; 1(Sep):311-355, 2001.


A family of gradient descent algorithms for learning linear functions in an online setting is considered. The family includes the classical LMS algorithm as well as new variants such as the Exponentiated Gradient (EG) algorithm due to Kivinen and Warmuth. The algorithms are based on prior distributions defined on the weight space. Techniques from differential geometry are used to develop the algorithms as gradient descent iterations with respect to the natural gradient in the Riemannian structure induced by the prior distribution. The proposed framework subsumes the notion of "link-functions".
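As a rough illustration of the two named members of this family, the sketch below contrasts one common textbook form of the LMS update (additive) with one common form of the EG update (multiplicative, renormalised onto the probability simplex). It is not taken from the paper and does not implement its Riemannian derivation. The function names, the squared-error loss, the learning rate `eta`, and the toy data are all assumptions made for illustration, and constant factors in the gradient are absorbed into `eta`.

```python
import math


def predict(w, x):
    """Linear prediction: inner product of weights and input."""
    return sum(wi * xi for wi, xi in zip(w, x))


def lms_step(w, x, y, eta):
    """One LMS step: additive update along the negative loss gradient."""
    err = predict(w, x) - y
    return [wi - eta * err * xi for wi, xi in zip(w, x)]


def eg_step(w, x, y, eta):
    """One EG step (assumed form): multiplicative update, then renormalise.

    Weights are assumed to start positive and sum to 1; the update keeps
    them on the probability simplex.
    """
    err = predict(w, x) - y
    unnorm = [wi * math.exp(-eta * err * xi) for wi, xi in zip(w, x)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Hypothetical noise-free toy problem; the target lies on the simplex so
# that both algorithms are able to represent it.
target = [0.7, 0.2, 0.1]
inputs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0],
]

w_lms = [0.0, 0.0, 0.0]          # LMS may start anywhere
w_eg = [1.0 / 3.0] * 3           # EG starts from the uniform weight vector

for _ in range(200):             # online passes over the example stream
    for x in inputs:
        y = predict(target, x)
        w_lms = lms_step(w_lms, x, y, eta=0.1)
        w_eg = eg_step(w_eg, x, y, eta=0.1)
```

The only difference between the two step functions is how the same error signal is applied to the weights: LMS subtracts a multiple of the input, while EG rescales each weight by an exponential factor and renormalises, so the EG weights stay positive and sum to one.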
