Fast Stagewise Sparse Factor Regression

Kun Chen; Ruipeng Dong; Wanwan Xu; Zemin Zheng

Sparse factorization of a large matrix is fundamental in modern statistical learning. In particular, the sparse singular value decomposition has been utilized in many multivariate regression methods. The appeal of this factorization is owing to its power in discovering a highly-interpretable latent association network. However, many existing methods are either ad hoc without a general performance guarantee, or are computationally intensive. We formulate the statistical problem as a sparse factor regression and tackle it with a two-stage “deflation + stagewise learning” approach. In the first stage, we consider both sequential and parallel approaches for simplifying the task into a set of co-sparse unit-rank estimation (CURE) problems, and establish the statistical underpinnings of these commonly-adopted and yet poorly understood deflation methods. In the second stage, we innovate a contended stagewise learning technique, consisting of a sequence of simple incremental updates, to efficiently trace out the whole solution paths of CURE. Our algorithm achieves a much lower computational complexity than alternating convex search, and it enables a flexible and principled tradeoff between statistical accuracy and computational efficiency. Our work is among the first to enable stagewise learning for non-convex problems, and the idea can be applicable in many multi-convex problems. Extensive simulation studies and an application in genetics demonstrate the effectiveness and scalability of our approach.

Fast Stagewise Sparse Factor Regression

Abstract