Active Learning and Experimental Design with SVMs
C.-H. Ho, M.-H. Tsai & C.-J. Lin; JMLR W&CP 16:71–84, 2011.
Abstract
In this paper, we consider active learning as a procedure that iterates two steps:
first, we train a classifier on labeled and unlabeled data; second, we query the labels
of some data points. The first step is achieved mainly by standard classifiers such as SVM and
logistic regression. We develop additional techniques for the case where very few labeled data
are available. These techniques help to obtain good classifiers in the early stage of the active
learning procedure. In the second step, based on SVM or logistic regression decision values, we
propose a framework to flexibly select points for query. We find that selecting points with
various distances to the decision boundary is important, but including more points close to the
decision boundary further improves the performance. Our experiments are conducted on the data
sets of the Causality Active Learning Challenge. Using Area Under the Curve (AUC) and
Area under the Learning Curve (ALC) as measurements, we find suitable methods for different
data sets.
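
The query-selection idea in the abstract (points at various distances to the decision boundary, with extra weight on points near it) can be illustrated with a minimal sketch. This is not the paper's framework: the function name `select_queries`, the parameter `near_fraction`, and the even-spacing rule are all assumptions made for illustration. The input is a list of decision values for the unlabeled points, such as those produced by an SVM or logistic regression model.

```python
def select_queries(decision_values, n_queries, near_fraction=0.5):
    """Pick indices of unlabeled points to query (illustrative sketch only).

    decision_values: classifier decision values, one per unlabeled point.
    n_queries:       number of points to select.
    near_fraction:   hypothetical knob; share of the queries taken from the
                     points closest to the decision boundary.
    """
    if not 0.0 <= near_fraction <= 1.0:
        raise ValueError("near_fraction must lie in [0, 1]")

    # Rank points by distance to the boundary (smallest |value| first).
    order = sorted(range(len(decision_values)),
                   key=lambda i: abs(decision_values[i]))
    n_queries = min(n_queries, len(order))

    # Part 1: the points closest to the decision boundary.
    n_near = int(round(n_queries * near_fraction))
    near = order[:n_near]

    # Part 2: spread the remaining queries evenly over the other points,
    # so that a range of distances to the boundary is represented.
    rest = order[n_near:]
    n_spread = n_queries - n_near
    spread = [rest[((k + 1) * len(rest)) // n_spread - 1]
              for k in range(n_spread)]

    return near + spread


if __name__ == "__main__":
    values = [2.0, -0.1, 0.5, -3.0, 0.05, 1.0, -1.5, 0.3]
    print(select_queries(values, n_queries=4))
```

Setting `near_fraction=1.0` in this sketch reduces it to plain uncertainty sampling (only the points nearest the boundary), while `near_fraction=0.0` samples evenly across all distances; intermediate values mix the two.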