Efficient Collapsed Gibbs Sampling for Latent Dirichlet Allocation
Han Xiao (Technical University of Munich) and Thomas Stibor (Technical
University of Munich)
JMLR W&P 13:63-78, 2010.
Abstract
Collapsed Gibbs sampling is a frequently applied method to approximate
intractable integrals in probabilistic generative models such as
latent Dirichlet allocation. This sampling method, however, has the
crucial drawback of high computational complexity, which limits its
applicability to large data sets. We propose a novel dynamic
sampling strategy to significantly improve the efficiency of collapsed
Gibbs sampling. The strategy is evaluated in terms of efficiency,
convergence, and perplexity. In addition, we present a straightforward
parallelization to further improve efficiency. Finally, we support the
proposed improvements with a comparative study on data sets of
different scales.