Stochastic Unsupervised Learning on Unlabeled Data
C. Liu, J. Xie, Y. Ge, H. Xiong;
JMLR W&CP 27:111–122, 2012.
Abstract
In this paper, we introduce a stochastic unsupervised learning method that was
used in the 2011 Unsupervised and Transfer Learning (UTL) challenge. The method
is designed to preprocess data that will be used in subsequent classification
problems. Specifically, it performs K-means clustering on principal components
instead of raw data, which removes the impact of noisy, irrelevant, or
less-relevant features and improves the robustness of the results. To alleviate
overfitting, we also use a stochastic process to combine multiple clustering
assignments for each data point. Promising results were observed on all the
test data sets. Indeed, the proposed method earned us second place in the
overall performance of the challenge.
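
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the stochastic combination is done, so this sketch assumes that K-means is rerun several times from random initial centers on the leading principal components and that the one-hot cluster assignments from all runs are concatenated into a new feature vector per data point. The function names (pca_project, kmeans, stochastic_cluster_features) and all parameter defaults are hypothetical.

```python
import numpy as np


def pca_project(X, n_components):
    """Project X onto its leading principal components (via SVD)."""
    Xc = X - X.mean(axis=0)                      # center each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T              # shape (n_samples, n_components)


def kmeans(Z, k, rng, n_iter=50):
    """Plain Lloyd's K-means with random initial centers; returns labels."""
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = Z[labels == j]
            if len(members) > 0:                 # keep old center if cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def stochastic_cluster_features(X, n_components=2, k=3, n_runs=10, seed=0):
    """Assumed combination scheme: concatenate one-hot assignments
    from several randomly initialized K-means runs on the PCA scores."""
    rng = np.random.default_rng(seed)
    Z = pca_project(X, n_components)
    blocks = []
    for _ in range(n_runs):
        labels = kmeans(Z, k, rng)
        blocks.append(np.eye(k)[labels])         # one-hot block, shape (n_samples, k)
    return np.hstack(blocks)                     # shape (n_samples, n_runs * k)
```

Calling stochastic_cluster_features(X) on an array X of shape (n_samples, n_features) returns a binary matrix with n_runs * k columns that could be passed to a downstream classifier. Averaging over runs, or varying k between runs, would be equally plausible readings of "combining multiple clustering assignments".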