P.-L. Chen et al; JMLR W&CP 18:21–60, 2012.
A Linear Ensemble of Individual and Blended Models
for Music Rating Prediction
Track 1 of KDDCup 2011 aims at predicting the rating behavior of users in the Yahoo!
Music system. At National Taiwan University, we organize a course that teams up students to
work on both tracks of KDDCup 2011. For track 1, we first tackle the problem by building
variants of existing individual models, including Matrix Factorization, Restricted Boltzmann
Machines, k-Nearest Neighbors, Probabilistic Latent Semantic Analysis, Probabilistic Principal
Component Analysis and Supervised Regression. We then blend the individual models along with
some carefully extracted features in a non-linear manner. A large linear ensemble that
contains both the individual and the blended models is learned and taken through some
post-processing steps to form the final solution. The four stages (individual model building,
non-linear blending, linear ensemble and post-processing) lead to a successful final solution,
within which techniques on feature engineering and aggregation (blending and ensemble
learning) play crucial roles. Our team is the first-prize winner of both tracks of KDD Cup 2011.
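
As an illustration of the linear ensemble stage described above, the following is a minimal sketch rather than the method actually used in the paper. It assumes that each individual or blended model has already produced predictions on a held-out validation set, and it learns one weight per model by ridge-regularized least squares. The function names (solve_linear_system, fit_linear_ensemble, ensemble_predict), the ridge parameter, and the clipping of the output to a 0 to 100 rating range are all assumptions made for this example; the abstract does not specify them.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; try a larger ridge value")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear_ensemble(
    model_preds: Sequence[Sequence[float]],
    targets: Sequence[float],
    ridge: float = 1e-3,
) -> List[float]:
    """Learn one weight per model (hypothetical helper).

    model_preds[k][i] is model k's prediction for validation example i.
    Solves the normal equations (P^T P + ridge * I) w = P^T y.
    """
    k = len(model_preds)
    gram = [
        [
            sum(p * q for p, q in zip(model_preds[a], model_preds[b]))
            + (ridge if a == b else 0.0)
            for b in range(k)
        ]
        for a in range(k)
    ]
    rhs = [sum(p * t for p, t in zip(model_preds[a], targets)) for a in range(k)]
    return solve_linear_system(gram, rhs)


def ensemble_predict(
    weights: Sequence[float],
    model_preds: Sequence[Sequence[float]],
    lo: float = 0.0,
    hi: float = 100.0,
) -> List[float]:
    """Weighted sum of model predictions, clipped to [lo, hi].

    The clipping range is an assumption standing in for post-processing.
    """
    n = len(model_preds[0])
    out = []
    for i in range(n):
        s = sum(w * preds[i] for w, preds in zip(weights, model_preds))
        out.append(min(hi, max(lo, s)))
    return out


if __name__ == "__main__":
    # Toy validation predictions from two hypothetical models.
    model_a = [10.0, 50.0, 90.0, 30.0]
    model_b = [20.0, 40.0, 80.0, 60.0]
    ratings = [13.0, 47.0, 87.0, 39.0]
    w = fit_linear_ensemble([model_a, model_b], ratings, ridge=0.0)
    print("weights:", w)
    print("ensemble:", ensemble_predict(w, [model_a, model_b]))
```

In this sketch the weights are fitted on validation predictions rather than training predictions, so that models which overfit the training data are not rewarded with large weights.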