3. LSVM (Lagrangian Support Vector Machine) Algorithm
Before stating our algorithm we define two matrices to simplify
notation as follows:
$$H = D\,[A \;\; -e], \qquad Q = \frac{I}{\nu} + HH' \tag{10}$$
With these definitions the dual problem (8) becomes
$$\min_{0 \le u \in R^m} f(u) := \frac{1}{2}u'Qu - e'u \tag{11}$$
It will be understood that within the LSVM Algorithm,
the single time that $Q^{-1}$ is computed at the outset
of the algorithm, the SMW identity (1) will be used.
Hence only an $(n+1)\times(n+1)$ matrix is inverted.
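To make the computational role of the SMW identity concrete, the following MATLAB fragment (a minimal illustration of ours, not part of the algorithm statement; problem sizes and variable names are arbitrary) applies $Q^{-1}$ to a vector while inverting only the $(n+1)\times(n+1)$ matrix $I/\nu + H'H$, and compares the result with a direct solve:

% Illustration: apply Q^{-1} via the SMW identity (1), inverting only an
% (n+1) x (n+1) matrix.  Problem data are random and purely illustrative.
m = 200; n = 5; nu = 1;
A = rand(m,n); D = diag(sign(rand(m,1)-0.5)); e = ones(m,1);
H = D*[A -e];                          % H = D[A -e] as in (10)
S = H*inv(speye(n+1)/nu + H'*H);       % only an (n+1) x (n+1) inverse
v = rand(m,1);
Qinv_v = nu*(v - S*(H'*v));            % SMW: Q^{-1}v = nu(v - H(I/nu+H'H)^{-1}H'v)
Q = speye(m)/nu + H*H';                % explicit Q, formed here only for checking
disp(norm(Q\v - Qinv_v))               % difference should be near machine precision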
The LSVM Algorithm is based directly on the Karush-Kuhn-Tucker
necessary and sufficient optimality conditions
[15, KTP 7.2.4, page 94] for the dual problem (11):
$$0 \le u \;\perp\; Qu - e \ge 0 \tag{12}$$
By using the easily established identity between any two real
numbers (or vectors) $a$ and $b$:

$$0 \le a \;\perp\; b \ge 0 \;\Longleftrightarrow\; a = (a - \alpha b)_+, \quad \alpha > 0, \tag{13}$$
the optimality condition (12) can be written in the following
equivalent form for any positive $\alpha$:

$$Qu - e = \big((Qu - e) - \alpha u\big)_+ \tag{14}$$
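As a quick numerical sanity check of the identity (13), the following throwaway MATLAB fragment (ours, purely illustrative) verifies it on scalar examples:

% Identity (13) on scalars: a >= 0, b >= 0, a*b = 0  <=>  a = (a - alpha*b)_+
plus = @(x) (abs(x)+x)/2;                     % the plus function (x)_+
alpha = 0.7;
a = 0; b = 3; disp(a == plus(a - alpha*b))    % complementary pair: identity holds (1)
a = 2; b = 0; disp(a == plus(a - alpha*b))    % complementary pair: identity holds (1)
a = 2; b = 3; disp(a == plus(a - alpha*b))    % a*b > 0: identity fails (0)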
These optimality conditions lead to the following very simple iterative
scheme which constitutes our LSVM Algorithm:
$$u^{i+1} = Q^{-1}\Big(e + \big((Qu^i - e) - \alpha u^i\big)_+\Big), \quad i = 0, 1, 2, \ldots \tag{15}$$
for which we will establish global linear convergence from any starting
point under the
easily satisfiable condition:
$$0 < \alpha < \frac{2}{\nu} \tag{16}$$
We impose this condition as $\alpha = 1.9/\nu$
in all our experiments,
where $\nu$ is the parameter of our SVM formulation (7).
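For a small number of points $m$, where $Q$ can be formed and factored explicitly, the iteration (15) translates into MATLAB almost verbatim; the sketch below (ours, on illustrative toy data; the SMW-based code we actually use appears as Code 2 below) is included only to show how literal that translation is:

% Iteration (15) written literally with an explicit Q (small m only; illustrative).
m = 100; n = 4; nu = 1;
A = rand(m,n); D = diag(sign(rand(m,1)-0.5)); e = ones(m,1);
H = D*[A -e]; Q = speye(m)/nu + H*H';               % Q from (10)
alpha = 1.9/nu; plus = @(x) (abs(x)+x)/2;           % (x)_+
u = Q\e;                                            % u^0 = Q^{-1}e
for i = 1:100
    u = Q\(e + plus((Q*u - e) - alpha*u));          % u^{i+1} from (15)
end
disp(norm((Q*u - e) - plus((Q*u - e) - alpha*u)))   % violation of (14); should be small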
It turns out, and this is the way that led us to this
iterative scheme, that the optimality condition (14)
is also the necessary and sufficient condition for the unconstrained minimum of
the implicit Lagrangian [21] associated with the
dual problem (11):

$$\min_{u \in R^m} L(u,\alpha) = \min_{u \in R^m}\; \frac{1}{2}u'Qu - e'u + \frac{1}{2\alpha}\Big(\big\|\big((Qu - e) - \alpha u\big)_+\big\|^2 - \|Qu - e\|^2\Big) \tag{17}$$
Setting the gradient with respect to $u$ of this convex and differentiable
Lagrangian to zero gives

$$(Qu - e) + \frac{1}{\alpha}(Q - \alpha I)\big((Qu - e) - \alpha u\big)_+ - \frac{1}{\alpha}Q(Qu - e) = 0, \tag{18}$$
or equivalently:

$$(Q - \alpha I)\Big(\big((Qu - e) - \alpha u\big)_+ - (Qu - e)\Big) = 0, \tag{19}$$
which is equivalent to the optimality condition (14)
under the assumption that $\alpha$ is positive and not an eigenvalue
of $Q$.
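For completeness, the passage from (18) to (19) is just a regrouping (our own intermediate step, using the symmetry of $Q$): since

$$(Qu - e) - \frac{1}{\alpha}Q(Qu - e) = -\frac{1}{\alpha}(Q - \alpha I)(Qu - e),$$

equation (18) can be rewritten as $\frac{1}{\alpha}(Q - \alpha I)\big(((Qu - e) - \alpha u)_+ - (Qu - e)\big) = 0$, and multiplying through by $\alpha$ gives (19).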
We establish now the global linear convergence of the iteration (15)
under condition (16).
Algorithm 1
LSVM Algorithm & Its Global Convergence
Let $Q \in R^{m \times m}$ be the symmetric positive
definite matrix defined by (10) and let (16) hold. Starting with an arbitrary
$u^0 \in R^m$, the iterates $u^i$ of (15) converge to the unique solution
$\bar u$ of (11) at the linear rate:

$$\|Qu^{i+1} - Q\bar u\| \le \|I - \alpha Q^{-1}\|\;\|Qu^i - Q\bar u\| \tag{20}$$
Proof  Because $\bar u$ is the solution of (11),
it must satisfy the optimality condition (14) for any $\alpha > 0$.
Subtracting that equation with $u = \bar u$ from the iteration
(15) premultiplied by $Q$ and taking norms gives:

$$\|Qu^{i+1} - Q\bar u\| = \Big\|\big((Qu^i - e) - \alpha u^i\big)_+ - \big((Q\bar u - e) - \alpha \bar u\big)_+\Big\| \tag{21}$$
Using the Projection Theorem [1, Proposition 2.1.3], which
states that the distance between any two points in $R^m$ is not less than the
distance between their projections on any convex set (the nonnegative orthant
here) in $R^m$, the above equation gives:

$$\|Qu^{i+1} - Q\bar u\| \le \|(Q - \alpha I)(u^i - \bar u)\| \le \|I - \alpha Q^{-1}\|\;\|Qu^i - Q\bar u\| \tag{22}$$
All we need to show now is that
$\|I - \alpha Q^{-1}\| < 1$. This follows
from (16) as follows. Noting the definition (10)
of $Q$ and letting $\lambda_i$, $i = 1, \ldots, m$, denote the nonnegative
eigenvalues of $HH'$, all we need is:

$$\Big(\frac{1}{\nu} + \lambda_i - \alpha\Big)^2 < \Big(\frac{1}{\nu} + \lambda_i\Big)^2, \quad i = 1, \ldots, m, \tag{23}$$
or equivalently:
$$0 < \alpha < 2\Big(\frac{1}{\nu} + \lambda_i\Big), \quad i = 1, \ldots, m, \tag{24}$$
which is satisfied under the assumption (16), since $\lambda_i \ge 0$ and hence $2/\nu \le 2(1/\nu + \lambda_i)$.
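The contraction constant appearing in (20) is also easy to observe numerically; the fragment below (ours, on random illustrative data) checks that $\|I - \alpha Q^{-1}\| < 1$ when (16) holds:

% Contraction constant of (20): ||I - alpha*Q^{-1}|| < 1 under condition (16).
m = 150; n = 3; nu = 1;
A = rand(m,n); D = diag(sign(rand(m,1)-0.5)); e = ones(m,1);
H = D*[A -e]; Q = eye(m)/nu + H*H';            % Q from (10)
alpha = 1.9/nu;                                % satisfies (16): 0 < alpha < 2/nu
disp(norm(eye(m) - alpha*inv(Q)))              % spectral norm; strictly less than 1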
We give now a complete MATLAB [26] code of LSVM which is
capable of solving problems with millions of points using only native
MATLAB commands.
The input parameters, besides $A$, $D$ and $\nu$ of (10),
which define the problem, are: itmax, the maximum number of iterations,
and tol, the tolerated nonzero error in $\|u^{i+1} - u^i\|$ at
termination. The quantity $\|Q\|\,\|u^{i+1} - u^i\|$
bounds from above:

$$\big\|(Qu^i - e) - \big((Qu^i - e) - \alpha u^i\big)_+\big\|, \tag{25}$$

which measures the violation of the optimality criterion (14).
It follows [20] that $\|u^{i+1} - u^i\|$ also bounds
$\|u^i - \bar u\|$, and by (9) it also bounds
$\|w^i - \bar w\|$ and $|\gamma^i - \bar\gamma|$,
where $(\bar w, \bar\gamma)$ is the unique solution of the primal
SVM (7).
Code 2
LSVM MATLAB M-File
function [it, opt, w, gamma] = svml(A,D,nu,itmax,tol)
% lsvm with SMW for min 1/2*u'*Q*u-e'*u s.t. u>=0,
% Q=I/nu+H*H', H=D[A -e]
% Input: A, D, nu, itmax, tol; Output: it, opt, w, gamma
% [it, opt, w, gamma] = svml(A,D,nu,itmax,tol);
[m,n]=size(A);alpha=1.9/nu;e=ones(m,1);H=D*[A -e];it=0;
S=H*inv((speye(n+1)/nu+H'*H));
u=nu*(1-S*(H'*e));oldu=u+1;
while it<itmax & norm(oldu-u)>tol
   z=(1+pl(((u/nu+H*(H'*u))-alpha*u)-1));
   oldu=u;
   u=nu*(z-S*(H'*z));
   it=it+1;
end;
opt=norm(u-oldu);w=A'*D*u;gamma=-e'*D*u;
function pl = pl(x); pl = (abs(x)+x)/2;  % plus function: (x)_+ = max(x,0)
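As a usage illustration of the M-file above (the toy data below are ours and are not one of the paper's test problems):

% Toy call of svml on two linearly separable Gaussian clouds.
m = 1000; n = 10;
A = [randn(m/2,n)+1; randn(m/2,n)-1];          % the two classes, shifted apart
d = [ones(m/2,1); -ones(m/2,1)];               % +1/-1 labels
D = spdiags(d,0,m,m);                          % diagonal label matrix
[it,opt,w,gamma] = svml(A,D,1,100,1e-5);       % nu = 1, itmax = 100, tol = 1e-5
trainerr = mean(sign(A*w - gamma) ~= d)        % error of the classifier sign(x'w - gamma)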