## Asymptotic behavior of Support Vector Machine for spiked population model

*Hanwen Huang*; 18(45):1−21, 2017.

### Abstract

For the spiked population model, we investigate the asymptotic
behavior of the Support Vector Machine (SVM) classification
method as the dimension $N$ and the sample size $M$ both grow
large, in the limit $N,M\rightarrow\infty$ at fixed $\alpha=M/N$. We focus on the
generalization performance by analytically evaluating the angle
between the normal direction vector of the SVM separating
hyperplane and that of the corresponding Bayes optimal separating
hyperplane. This result is analogous to the one shown in Paul
(2007) and Nadler (2008) for the angle between the sample
eigenvector and the population eigenvector in random matrix
theory. We provide not just a bound, but a sharp prediction of the
asymptotic behavior of SVM, determined by a set of
nonlinear equations. Based on the analytical results, we propose
a new method for selecting the tuning parameter which significantly
reduces the computational cost. A surprising finding is that SVM
achieves its best performance at small values of the tuning
parameter under the spiked population model. These results are
confirmed by comparison with numerical
simulations on finite-size systems. We also apply our formulas
to an actual breast cancer dataset and find agreement between
the analytical derivations and numerical computations based on cross
validation.
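
The quantity studied in the abstract, the angle between the SVM normal vector and the Bayes optimal direction, can be estimated numerically on a finite-size system. The following is a minimal sketch under simplifying assumptions that are not taken from the paper: a toy two-class model $x = y\mu + z$ with identity noise covariance (so the Bayes optimal normal is parallel to $\mu$), and a linear soft-margin SVM fitted by plain subgradient descent. The function name `simulate_angle_cosine` and the parameters `signal`, `lam`, and `epochs` are hypothetical, chosen only for illustration.

```python
import math
import random


def simulate_angle_cosine(n_dim=20, n_samples=80, signal=2.0,
                          lam=1.0, epochs=200, seed=0):
    """Cosine of the angle between a fitted linear-SVM normal vector
    and the Bayes optimal direction in an assumed toy model."""
    rng = random.Random(seed)

    # Assumed toy model (not the paper's exact setup):
    # x = y * mu + z, with z ~ N(0, I) and mu = signal * e_1.
    # With identity covariance, the Bayes optimal normal is parallel to mu.
    mu = [signal] + [0.0] * (n_dim - 1)
    labels = [rng.choice((-1.0, 1.0)) for _ in range(n_samples)]
    data = [[y * m + rng.gauss(0.0, 1.0) for m in mu] for y in labels]

    # Full-batch subgradient descent on
    #   lam / 2 * ||w||^2 + (1 / M) * sum_i max(0, 1 - y_i <w, x_i>).
    w = [0.0] * n_dim
    for t in range(1, epochs + 1):
        step = 1.0 / (lam * t)
        grad = [lam * wj for wj in w]
        for x, y in zip(data, labels):
            margin = y * sum(wj * xj for wj, xj in zip(w, x))
            if margin < 1.0:  # point violates the margin
                for j in range(n_dim):
                    grad[j] -= y * x[j] / n_samples
        w = [wj - step * gj for wj, gj in zip(w, grad)]

    dot = sum(wj * mj for wj, mj in zip(w, mu))
    norm_w = math.sqrt(sum(wj * wj for wj in w))
    norm_mu = math.sqrt(sum(mj * mj for mj in mu))
    return dot / (norm_w * norm_mu)


if __name__ == "__main__":
    # alpha = M / N = 4 in this toy run
    print(simulate_angle_cosine())
```

Repeating the run while increasing `n_dim` and `n_samples` at a fixed ratio $\alpha = M/N$, or while varying `lam`, gives finite-size estimates of the kind that could be compared against an asymptotic prediction. The sketch does not implement the paper's nonlinear equations.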
