Collaborative Filtering Ensemble for Ranking
M. Jahrer & A. Töscher; JMLR W&CP 18:153–167, 2012.
Abstract
This paper presents the solution of the team “commendo” on the Track2 dataset of
the KDD Cup 2011 (Dror et al.). Yahoo Labs provides a snapshot of their music-rating database as
the dataset for the competition, consisting of approximately 62 million ratings from 250k users on
300k items. The dataset includes hierarchical information about the items. The goal of the
competition is to distinguish between “High rated” and “Not rated” items of a user. The rating
scale is discrete and ranges from 0 to 100, where a “High” rating is a rating ≥ 80. The error
measure is the percentage of incorrectly classified tracks over all users, known as the fraction of
misclassifications. The task is to minimize this error rate, hence the ranking should be
optimized. Our final submission is a blend of different collaborative filtering algorithms
enhanced with basic statistics. The algorithms are trained consecutively and they are
blended together with a neural network. Each of the algorithms optimizes a rank error
measure.
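
To make the error measure concrete, the following is a minimal sketch of how a fraction of misclassifications could be computed from per-user scores. It is an illustration, not the authors' evaluation code. The function name `misclassification_rate`, the dictionary layout, and the parameter `n_high` (the number of candidates per user labelled “High”, defaulting to 3) are assumptions introduced here and are not stated in the abstract.

```python
from typing import Dict, List


def misclassification_rate(
    scores: Dict[str, List[float]],
    labels: Dict[str, List[int]],
    n_high: int = 3,  # assumption: number of items per user predicted as "High"
) -> float:
    """Fraction of misclassified candidate items over all users.

    scores[user][i] is the predicted score of the user's i-th candidate item;
    labels[user][i] is 1 for a "High rated" item and 0 for a "Not rated" item.
    """
    errors = 0
    total = 0
    for user, user_scores in scores.items():
        user_labels = labels[user]
        # Rank the candidates by descending predicted score.
        order = sorted(
            range(len(user_scores)), key=lambda i: user_scores[i], reverse=True
        )
        # The n_high top-ranked candidates are predicted "High", the rest "Not rated".
        predicted = [0] * len(user_scores)
        for i in order[:n_high]:
            predicted[i] = 1
        errors += sum(p != y for p, y in zip(predicted, user_labels))
        total += len(user_labels)
    return errors / total


# Hypothetical toy data: two users with six candidate items each.
example_scores = {
    "u1": [0.9, 0.8, 0.1, 0.7, 0.2, 0.3],
    "u2": [0.5, 0.4, 0.6, 0.1, 0.2, 0.3],
}
example_labels = {
    "u1": [1, 1, 1, 0, 0, 0],
    "u2": [1, 1, 1, 0, 0, 0],
}
example_rate = misclassification_rate(example_scores, example_labels)
```

Because only the order of the scores within each user matters in this sketch, any monotone rescaling of a model's outputs leaves the error unchanged, which is why optimizing the ranking rather than the rating values is the natural objective.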