Multi-Source Feature Selection via Geometry-Dependent Covariance Analysis
Zheng Zhao, Huan Liu
JMLR W&P 4:36-47, 2008.
Abstract
Feature selection is an effective approach to reducing
dimensionality by selecting relevant original features. In this
work, we studied a novel problem of multi-source feature
selection for unlabeled data: given multiple heterogeneous data
sources (or data sets), select features from one source of interest
by integrating information from various data sources. In essence, we
investigated how the information contained in multiple data sources
can be used to derive intrinsic relationships that help select more
meaningful (or domain-relevant) features. We
studied how to adjust the covariance matrix of a data set using the
geometric structure obtained from multiple data sources, and how to
select features of the target source using geometry-dependent
covariance. We designed and conducted experiments that systematically
compare the proposed approach with representative methods on the
multi-source feature selection problem. The empirical study
demonstrated the efficacy and
potential of multi-source feature selection.
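
The idea of adjusting a covariance matrix with geometric structure can be illustrated with a minimal sketch. The abstract does not give the paper's formula, so everything below is an assumption made for illustration: the function names are hypothetical, the geometry is an RBF similarity averaged over sources, every source is assumed to describe the same samples in the same row order, and the covariance is a generic pairwise-weighted form rather than the authors' definition. The sketch covers only the covariance adjustment, not the feature selection step.

```python
import numpy as np


def rbf_affinity(X, gamma=1.0):
    """Pairwise RBF similarity between the rows (samples) of X."""
    sq = np.sum(X ** 2, axis=1)
    # Squared Euclidean distances, clipped at 0 to absorb round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    return np.exp(-gamma * d2)


def combined_affinity(sources, gamma=1.0):
    """Average the per-source similarities (assumed combination rule).

    Each array in `sources` must have one row per sample, with rows
    aligned across sources.
    """
    return sum(rbf_affinity(S, gamma) for S in sources) / len(sources)


def geometry_weighted_covariance(X, W):
    """Pairwise-weighted covariance of the target source X.

    Computes sum_ij W_ij (x_i - x_j)(x_i - x_j)^T / (2 * sum_ij W_ij),
    which for symmetric W equals X^T (D - W) X / sum(W), where D is the
    diagonal matrix of row sums of W.
    """
    L = np.diag(W.sum(axis=1)) - W
    return X.T @ L @ X / W.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.normal(size=(8, 5))      # source of interest
    auxiliary = rng.normal(size=(8, 3))   # second source, same samples
    W = combined_affinity([target, auxiliary], gamma=0.5)
    C = geometry_weighted_covariance(target, W)
    print(C.shape)
```

With uniform weights (W filled with ones) the expression reduces to the ordinary biased sample covariance, which makes the role of the weights easy to see: they change how much each pair of samples contributes.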