Our goal is to presort the mutual information values $I_{uv}$ for all
variables $u$ that do not co-occur with $v$. The following theorem
shows that this can be done exactly as before.
Theorem   Let $u, v, w$ be discrete variables such that $u$ and $w$ do
not co-occur with $v$ (i.e. $N_{uv} = N_{wv} = 0$) in a given
dataset $\mathcal{D}$. Let $N_u$ and $N_w$ be the number of datapoints for
which $u \neq 0$ and $w \neq 0$ respectively, and let $I_{uv}$ and $I_{wv}$ be the
respective empirical mutual information values based on the sample
$\mathcal{D}$. Then
\[
  N_u > N_w \;\Longrightarrow\; I_{uv} \geq I_{wv},
\]
with equality only if $v$ is identically 0.
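In algorithmic terms, the theorem says that for a fixed $v$, the mutual information values $I_{uv}$ of all variables $u$ that do not co-occur with $v$ are ordered by the counts $N_u$, so they can be sorted without ever being evaluated. Below is a minimal sketch of this presorting step in Python, assuming the counts $N_u$ and the co-occurrence counts $N_{uv}$ are already available as dictionaries; the function name and data layout are illustrative, not the paper's implementation.

    # Presort, for a fixed target variable v, all variables u with N_uv == 0
    # by their counts N_u.  By the theorem, decreasing N_u gives
    # non-increasing I_uv, so the mutual informations never need to be
    # computed for this part of the sort.
    def presort_non_cooccurring(counts, cooccur, v):
        """counts[u]       : N_u, number of datapoints with u != 0
           cooccur[(u, v)] : N_uv, number of datapoints with u != 0 and v != 0"""
        candidates = [u for u in counts if u != v and cooccur.get((u, v), 0) == 0]
        return sorted(candidates, key=lambda u: counts[u], reverse=True)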
The proof of the theorem is given in the Appendix. The implication
of this theorem is that the ACCL algorithm can be extended to
variables taking more than two values by making only one (minor)
modification: the replacement of the scalar counts $N_v$ and $N_{uv}$
by the vectors $(N_v^{x_v})_{x_v \neq 0}$ and, respectively, the contingency
tables $(N_{uv}^{x_u x_v})_{x_u, x_v \neq 0}$.
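To make the modification concrete, the following sketch accumulates the per-value count vectors and the pairwise contingency tables (restricted to nonzero values) in one pass over a sparse dataset. It assumes each datapoint is represented as a dictionary mapping the variables that are nonzero in it to their values, with variables indexed by integers; this representation is an illustrative assumption, not the paper's data structure.

    from collections import defaultdict
    from itertools import combinations

    def sparse_counts(data):
        """data: list of dicts {variable index: nonzero value}, one per datapoint.
        Returns N_v[v][x_v] for x_v != 0 and N_uv[(u, v)][(x_u, x_v)] for
        x_u, x_v != 0; only co-occurring pairs ever appear in N_uv."""
        N_v = defaultdict(lambda: defaultdict(int))
        N_uv = defaultdict(lambda: defaultdict(int))
        for point in data:
            for v, x_v in point.items():
                N_v[v][x_v] += 1
            # each pair of variables that are both nonzero in this datapoint
            for (u, x_u), (v, x_v) in combinations(sorted(point.items()), 2):
                N_uv[(u, v)][(x_u, x_v)] += 1
        return N_v, N_uv

The scalar count $N_u$ used for presorting is recovered as the sum of $u$'s count vector, and a pair $(u, v)$ co-occurs exactly when its contingency table is nonempty.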
Figure 23: Running time for the ACCL (full line) and CHOWLIU (dotted line) algorithms versus the number of vertices $n$ for different values of the sparseness $s$.
Figure 24: Number of steps of the Kruskal algorithm versus domain size $n$, measured for the ACCL algorithm for different values of the sparseness $s$.