A Linear Ensemble of Individual and Blended Models
for Music Rating Prediction
P.-L. Chen et al.; JMLR W&CP 18:21–60, 2012.
Abstract
Track 1 of KDD Cup 2011 aims at predicting the rating behavior of users in the Yahoo!
Music system. At National Taiwan University, we organize a course that teams up students to
work on both tracks of KDD Cup 2011. For Track 1, we first tackle the problem by building
variants of existing individual models, including Matrix Factorization, Restricted Boltzmann
Machine, k-Nearest Neighbors, Probabilistic Latent Semantic Analysis, Probabilistic Principal
Component Analysis and Supervised Regression. We then blend the individual models along with
some carefully extracted features in a non-linear manner. A large linear ensemble that
contains both the individual and the blended models is learned and taken through some
post-processing steps to form the final solution. The four stages (individual model building,
non-linear blending, linear ensemble, and post-processing) lead to a successful final solution,
within which techniques for feature engineering and aggregation (blending and ensemble
learning) play crucial roles. Our team is the first prize winner of both tracks of KDD Cup
2011.
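
The linear ensemble stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: the ridge-regularized least-squares fit, the synthetic "models" (true ratings plus independent noise), the clipping range of 0 to 100 as a stand-in for post-processing, and all function and variable names are assumptions made for illustration only.

```python
import numpy as np


def fit_linear_ensemble(preds, ratings, ridge=1e-3):
    """Return weights w minimizing ||preds @ w - ratings||^2 + ridge * ||w||^2.

    preds   : (n_samples, n_models) validation-set predictions, one column per model
    ratings : (n_samples,) true validation ratings
    """
    n_models = preds.shape[1]
    gram = preds.T @ preds + ridge * np.eye(n_models)
    return np.linalg.solve(gram, preds.T @ ratings)


def predict_ensemble(preds, weights, lo=0.0, hi=100.0):
    """Combine model predictions linearly, then clip to an assumed rating range."""
    return np.clip(preds @ weights, lo, hi)


def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))


# Synthetic data (hypothetical): three "models" that each see the true
# rating corrupted by independent Gaussian noise of different strength.
rng = np.random.default_rng(0)
true_ratings = rng.uniform(0, 100, size=2000)
noise_scales = [20.0, 25.0, 30.0]
model_preds = np.column_stack(
    [true_ratings + rng.normal(0, s, size=true_ratings.size) for s in noise_scales]
)

# Learn the ensemble weights on one half, evaluate on the other half.
val, test = slice(0, 1000), slice(1000, 2000)
weights = fit_linear_ensemble(model_preds[val], true_ratings[val])

ensemble_rmse = rmse(predict_ensemble(model_preds[test], weights), true_ratings[test])
single_rmses = [rmse(model_preds[test, j], true_ratings[test]) for j in range(3)]

print("weights:", weights)
print("single-model RMSE:", single_rmses)
print("ensemble RMSE:", ensemble_rmse)
```

In this toy setting the learned combination should score a lower held-out RMSE than any single column, because the errors of the synthetic models are independent; real models have correlated errors, so the gain from a linear ensemble is typically smaller.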