Divide-and-Conquer for Debiased $l_1$-norm Support Vector Machine in Ultra-high Dimensions

Heng Lian; Zengyan Fan

The $l_1$-norm support vector machine (SVM) generally performs competitively with the standard $l_2$-norm SVM in classification problems, with the added advantage of automatically selecting relevant features. We propose a divide-and-conquer approach for the large-sample, high-dimensional setting: the data set is split across multiple machines, and the debiased estimators computed on each machine are then averaged. Extending existing theoretical studies to the SVM is challenging because estimating the inverse Hessian matrix requires approximating the Dirac delta function via smoothing. We show that, under appropriate conditions, the aggregated estimator attains the same convergence rate as the central estimator that uses all observations.
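
As a rough sketch in assumed notation (the symbols $m$, $\hat{\beta}^{(k)}$, $K$, and $h$ are introduced here for illustration only and are not fixed by the text above): if the data are split across $m$ machines and $\hat{\beta}^{(k)}$ denotes the debiased estimator computed on the $k$-th machine, the aggregated estimator is the simple average

$$\bar{\beta} = \frac{1}{m} \sum_{k=1}^{m} \hat{\beta}^{(k)}.$$

The Dirac delta function enters because the hinge loss $(1 - y x^\top \beta)_+$ has a second derivative in $\beta$ that is a point mass at the margin, so the population Hessian takes the form $E[\delta(1 - Y X^\top \beta)\, X X^\top]$. One plausible smoothed estimate replaces $\delta(\cdot)$ with $K(\cdot/h)/h$ for a kernel $K$ and a bandwidth $h \to 0$.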

Abstract