Random Sets Approach and its Applications
Vladimir Nikulin; JMLR W&CP 3:65-76, 2008.
Abstract
The random sets approach is heuristic in nature and has been inspired
by the growing speed of computation. For example, we can consider a large
number of classifiers where any single classifier is based on a relatively
small subset of randomly selected features (a random set of features). Using
cross-validation, we can rank all random sets according to a selected criterion
and use this ranking for further feature selection. Another application of
random sets was motivated by huge imbalanced data sets, which represent a significant
problem because the corresponding classifier tends to ignore patterns
with smaller representation in the training set. Here, we propose to consider
a large number of balanced training subsets where representatives of both
patterns are selected randomly. The above models demonstrated competitive
results in two data mining competitions.
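
The two ideas above can be illustrated with a minimal sketch. It is not the implementation used in the competitions: the nearest-centroid classifier, the accuracy criterion, the averaging of set scores per feature, the equal per-class sample size, the toy data, and all function names are assumptions made only for illustration.

```python
import random
from statistics import mean


def make_toy_data(n=200, n_features=10, informative=(0, 1), shift=3.0, seed=0):
    """Hypothetical two-class data: only `informative` features depend on the label."""
    rng = random.Random(seed)
    y = [int(rng.random() < 0.5) for _ in range(n)]
    X = [[rng.gauss(shift * label if f in informative else 0.0, 1.0)
          for f in range(n_features)] for label in y]
    return X, y


def nearest_centroid_accuracy(X, y, features, train_idx, test_idx):
    """Accuracy of a nearest-centroid classifier restricted to `features`."""
    centroids = {}
    for label in set(y[i] for i in train_idx):
        rows = [X[i] for i in train_idx if y[i] == label]
        centroids[label] = [mean(r[f] for r in rows) for f in features]
    correct = 0
    for i in test_idx:
        point = [X[i][f] for f in features]
        pred = min(centroids, key=lambda c: sum(
            (a - b) ** 2 for a, b in zip(point, centroids[c])))
        correct += pred == y[i]
    return correct / len(test_idx)


def cv_score(X, y, features, k=5):
    """Mean k-fold cross-validated accuracy for one feature set."""
    idx = list(range(len(X)))
    scores = []
    for fold in (idx[i::k] for i in range(k)):
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        scores.append(nearest_centroid_accuracy(X, y, features, train, fold))
    return mean(scores)


def rank_random_sets(X, y, n_sets=60, set_size=3, seed=0):
    """Score many random feature sets by CV and sort them, best first."""
    rng = random.Random(seed)
    scored = []
    for _ in range(n_sets):
        features = sorted(rng.sample(range(len(X[0])), set_size))
        scored.append((cv_score(X, y, features), features))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


def feature_scores(scored_sets, n_features):
    """Assumed aggregation: mean CV score of the random sets containing each feature."""
    per_feature = [[] for _ in range(n_features)]
    for score, features in scored_sets:
        for f in features:
            per_feature[f].append(score)
    return [mean(s) if s else 0.0 for s in per_feature]


def balanced_subsets(y, n_subsets, per_class, seed=0):
    """Draw training subsets with the same number of random examples per class."""
    rng = random.Random(seed)
    by_class = {}
    for i, label in enumerate(y):
        by_class.setdefault(label, []).append(i)
    subsets = []
    for _ in range(n_subsets):
        subset = []
        for members in by_class.values():
            subset.extend(rng.sample(members, min(per_class, len(members))))
        subsets.append(subset)
    return subsets


# Feature selection via ranked random sets on the toy data.
X, y = make_toy_data()
ranked_sets = rank_random_sets(X, y)
scores = feature_scores(ranked_sets, n_features=len(X[0]))
print("features by mean CV accuracy:",
      sorted(range(len(scores)), key=scores.__getitem__, reverse=True))

# Balanced training subsets from imbalanced labels (10 positives, 190 negatives).
labels = [1] * 10 + [0] * 190
for subset in balanced_subsets(labels, n_subsets=3, per_class=8):
    print("positives:", sum(labels[i] for i in subset), "of", len(subset))
```

In such a sketch, one classifier would be trained on each balanced subset and the predictions combined, for example by averaging; the combination rule is likewise an assumption here.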