Introduction
The Bayesian network has proven to be a valuable tool for encoding,
learning, and reasoning about probabilistic relationships. In this
paper, we introduce another graphical representation of such
relationships called a dependency network. The representation
can be thought of as a collection of regressions or classifications
among variables in a domain that can be combined using the machinery
of Gibbs sampling to define a joint distribution for that domain. The
dependency network has several advantages and disadvantages with
respect to the Bayesian network. For example, a dependency network is
not useful for encoding causal relationships and is difficult to
construct using a knowledge-based approach. Nonetheless, there are
straightforward and computationally efficient algorithms for learning
both the structure and probabilities of a dependency network from
data; and the learned model is quite useful for encoding and
displaying predictive (i.e., dependence and independence)
relationships. In addition, dependency networks are well suited to
the task of predicting preferences (a task often referred to as
collaborative filtering) and are generally useful for
probabilistic inference, the task of answering probabilistic queries.

In Section 2, we motivate dependency networks from the
perspective of data visualization and introduce a special case of the
graphical representation called a consistent dependency network. We
show, roughly speaking, that such a network is equivalent to a Markov
network, and describe how Gibbs sampling is used to answer
probabilistic queries given a consistent dependency network. In
Section 3, we introduce the dependency network in its
general form and describe an algorithm for learning its structure and
probabilities from data. Essentially, the algorithm consists of
independently performing a probabilistic classification or regression
for each variable in the domain. We then show how procedures closely
resembling Gibbs sampling can be applied to the dependency network to
define a joint distribution for the domain and to answer probabilistic
queries. In addition, we provide experimental results on real data
that illustrate the utility of this approach, and discuss related
work. In Section 4, we describe the task of collaborative
filtering and present an empirical study showing that, on this task,
dependency networks are almost as accurate as Bayesian networks and
computationally more attractive. Finally, in
Section 5, we describe a data visualization tool based on
dependency networks.
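
To make the outline above concrete, the following is a minimal sketch of the general idea, not the algorithm presented in the paper. It assumes a toy domain of three binary variables, stands in smoothed frequency tables for the probabilistic classifiers, and uses a fixed-order sampler in the style of Gibbs sampling. The names `learn_local_conditionals`, `gibbs_samples`, and the toy data generator are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def learn_local_conditionals(data, alpha=1.0):
    """Estimate p(x_i = 1 | all other variables) separately for each variable.

    Assumption: binary variables and a tiny domain, so a Laplace-smoothed
    frequency table can play the role of the probabilistic classifier.
    """
    n_vars = len(data[0])
    tables = []
    for i in range(n_vars):
        # context (values of the other variables) -> [count x_i=0, count x_i=1]
        counts = defaultdict(lambda: [0, 0])
        for row in data:
            context = row[:i] + row[i + 1:]
            counts[context][row[i]] += 1
        tables.append(dict(counts))

    def prob_one(i, state):
        context = tuple(state[:i]) + tuple(state[i + 1:])
        c0, c1 = tables[i].get(context, (0, 0))
        return (c1 + alpha) / (c0 + c1 + 2 * alpha)

    return prob_one


def gibbs_samples(prob_one, n_vars, n_samples, burn_in=200, seed=0):
    """Resample each variable in a fixed order from its local conditional."""
    rng = random.Random(seed)
    state = [rng.randint(0, 1) for _ in range(n_vars)]
    samples = []
    for sweep in range(burn_in + n_samples):
        for i in range(n_vars):
            state[i] = 1 if rng.random() < prob_one(i, state) else 0
        if sweep >= burn_in:
            samples.append(tuple(state))
    return samples


# Hypothetical toy data: x1 copies x0 with 10% noise; x2 is independent.
rng = random.Random(0)
data = []
for _ in range(500):
    x0 = rng.randint(0, 1)
    x1 = x0 if rng.random() < 0.9 else 1 - x0
    x2 = 1 if rng.random() < 0.3 else 0
    data.append((x0, x1, x2))

prob_one = learn_local_conditionals(data)
samples = gibbs_samples(prob_one, n_vars=3, n_samples=2000, seed=1)

# Answer a simple probabilistic query from the samples: how often x0 == x1.
agreement = sum(s[0] == s[1] for s in samples) / len(samples)
print("estimated P(x0 == x1):", round(agreement, 3))
```

Each local conditional is fitted without reference to the others, which is what keeps learning cheap. The sampler is then the only place where the separate models are combined into a joint distribution.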
Journal of Machine Learning Research,
2000-10-19