This paper describes the mixtures-of-trees model, a probabilistic
model for discrete multidimensional domains. Mixtures-of-trees
generalize the probabilistic trees of [Chow and Liu, 1968]
in a different and complementary direction to that of Bayesian networks.
We present efficient algorithms for learning mixtures-of-trees
models in maximum likelihood and Bayesian frameworks.
We also discuss additional efficiencies that can be
obtained when data are ``sparse,'' and we present data
structures and algorithms that exploit such sparseness.
Experimental results demonstrate the performance of the
model for both density estimation and classification.
We also discuss the sense in which tree-based classifiers
perform an implicit form of feature selection, and demonstrate
a resulting insensitivity to irrelevant attributes.