Minimal Kernel Classifiers
Glenn M. Fung, Olvi L. Mangasarian, Alexander J. Smola;
3(Nov):303-321, 2002.
Abstract
A finite concave minimization algorithm is proposed for constructing
kernel classifiers that use a minimal number of data points
both in generating and characterizing a classifier.
The algorithm is justified theoretically, on the basis
of linear programming perturbation theory and a leave-one-out
error bound, and empirically by effective computational results
on seven real-world datasets. A nonlinear rectangular kernel
is generated by systematically utilizing
as few of the data as possible both in training
and in
characterizing a nonlinear separating surface.
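For illustration only: a rectangular kernel pairs all m training points
with a much smaller retained subset of k points, yielding an m x k kernel
matrix with k much smaller than m. The Python sketch below assumes a
Gaussian kernel with width parameter mu; the function name and both of
these choices are assumptions for this sketch, not specifics taken from
the paper.

    import numpy as np

    def rectangular_gaussian_kernel(A, B, mu=1.0):
        # Hypothetical sketch: rectangular kernel K(A, B') between the
        # m training points (rows of A) and a small kept subset B
        # (k rows, k << m). The Gaussian form and mu are assumptions.
        sq_dist = (np.sum(A**2, axis=1)[:, None]
                   + np.sum(B**2, axis=1)[None, :]
                   - 2.0 * A @ B.T)
        return np.exp(-mu * sq_dist)  # m x k kernel matrix

Because the classifier is characterized by the k kept points alone,
evaluating it on new data requires only this thin matrix, which is the
source of the speed and storage gains described below.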
This can result in a substantial reduction in kernel data dependence
(over 94% in six of the seven public datasets tested), with test
set correctness equal to that obtained by a
conventional support vector machine classifier that depends on many more data points. This
reduction in data dependence results in a much faster classifier that
requires less storage.
To eliminate data points, the proposed approach
makes use of a novel loss function, the "pound" function (#), which
is a linear combination of the 1-norm and the step function that measures
both the magnitude and the presence of any error.
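As a rough illustration, a loss of this kind applied to nonnegative error
slacks might be sketched as follows; the function name pound_loss and the
mixing parameter lam are assumptions for this sketch, not notation from
the paper.

    import numpy as np

    def pound_loss(slacks, lam=0.5):
        # Hypothetical sketch of the "pound" (#) loss: a linear
        # combination of the 1-norm of each error (its magnitude) and
        # the step function of each error (its presence). lam is an
        # assumed mixing weight between the two terms.
        slacks = np.maximum(slacks, 0.0)          # only violations count
        magnitude = slacks                        # 1-norm component, per point
        presence = (slacks > 0).astype(float)     # step-function component
        return lam * magnitude + (1.0 - lam) * presence

    # Example: pound_loss(np.array([-0.2, 0.0, 0.5])) gives [0., 0., 0.75]:
    # a zero or negative slack incurs no loss, while a positive slack
    # pays both for being present and for its size.

Intuitively, the 1-norm term penalizes the size of an error while the
step term penalizes its mere presence.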