C.-H. Ho, M.-H. Tsai & C.-J. Lin; JMLR W&CP 16:71–84, 2011.
Active Learning and Experimental Design with SVMs
In this paper, we consider active learning as a procedure that iteratively performs
two steps: first, we train a classifier based on labeled and unlabeled data; second, we query the labels
of some data points. The first part is achieved mainly by standard classifiers such as SVM and
logistic regression. We develop additional techniques for the case where very few labeled data are available. These
techniques help to obtain good classifiers in the early stage of the active learning procedure. In
the second part, based on SVM or logistic regression decision values, we propose a framework to
flexibly select points for query. We find that selecting points with various distances to the
decision boundary is important, but including more points close to the decision boundary
further improves the performance. Our experiments are conducted on the data sets of
the Causality Active Learning Challenge. Measuring performance by the Area Under the Curve (AUC) and the
Area under the Learning Curve (ALC), we find suitable methods for different data
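
The query step described above, in which points at various distances from the decision boundary are selected but more of them are taken close to it, can be illustrated with a minimal sketch. This is not the authors' framework: the function name `select_queries`, the equal-sized distance bins, and the geometric `decay` weighting are all assumptions made for illustration. The input is assumed to be a list of SVM or logistic regression decision values for the unlabeled points.

```python
from typing import List, Sequence


def select_queries(decision_values: Sequence[float], batch_size: int,
                   n_bins: int = 4, decay: float = 0.5) -> List[int]:
    """Pick indices of unlabeled points to query (hypothetical helper).

    Points are ranked by |decision value| and split into n_bins groups of
    near-equal size. Every group contributes at least one point when the
    batch is large enough, and groups nearer the boundary get larger quotas.
    """
    n = len(decision_values)
    batch_size = min(batch_size, n)
    if batch_size <= 0:
        return []
    # Rank points from closest to farthest from the decision boundary.
    order = sorted(range(n), key=lambda i: abs(decision_values[i]))
    n_bins = max(1, min(n_bins, n))
    # Contiguous bins over the ranking (bin 0 is closest to the boundary).
    bounds = [round(k * n / n_bins) for k in range(n_bins + 1)]
    bins = [order[bounds[k]:bounds[k + 1]] for k in range(n_bins)]
    # Assumed weighting: each bin gets `decay` times the share of the previous one.
    weights = [decay ** k for k in range(n_bins)]
    total = sum(weights)
    # Reserve one slot per bin if possible, then split the rest by weight.
    base = 1 if batch_size >= n_bins else 0
    remaining = batch_size - base * n_bins
    chosen: List[int] = []
    for members, w in zip(bins, weights):
        quota = min(len(members), base + int(remaining * w / total))
        chosen.extend(members[:quota])
    # Rounding down can leave free slots; fill them with the closest unused points.
    taken = set(chosen)
    for i in order:
        if len(chosen) >= batch_size:
            break
        if i not in taken:
            chosen.append(i)
            taken.add(i)
    return chosen
```

With the default settings, asking for 8 queries out of 16 unlabeled points takes 4, 2, 1, and 1 points from the four bins, ordered from nearest to farthest from the boundary. Setting `n_bins=1` reduces the sketch to plain uncertainty sampling, which selects only the points closest to the boundary.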