Approximate Submodularity and its Applications: Subset Selection, Sparse Approximation and Dictionary Selection
Abhimanyu Das, David Kempe; 19(3):1−34, 2018.
Abstract
We introduce the submodularity ratio as a measure of how “close” to submodular a set function f is. We show that when f has submodularity ratio γ, the greedy algorithm for maximizing f provides a (1−e−γ)-approximation. Furthermore, when γ is bounded away from 0, the greedy algorithm for minimum submodular cover also provides essentially an O(logn) approximation for a universe of n elements. As a main application of this framework, we study the problem of selecting a subset of k random variables from a large set, in order to obtain the best linear prediction of another variable of interest. We analyze the performance of widely used greedy heuristics; in particular, by showing that the submodularity ratio is lower-bounded by the smallest 2k-sparse eigenvalue of the covariance matrix, we obtain the strongest known approximation guarantees for the Forward Regression and Orthogonal Matching Pursuit algorithms. As a second application, we analyze greedy algorithms for the dictionary selection problem, and significantly improve the previously known guarantees. Our theoretical analysis is complemented by experiments on real-world and synthetic data sets; in particular, we focus on an analysis of how tight various spectral parameters and the submodularity ratio are in terms of predicting the performance of the greedy algorithms.
© JMLR 2018. (edit, beta) |