Closure-Based Confidence Boost in Association Rules
José L. Balcázar; JMLR W&CP 11:74-80, 2010.
We focus on association rule mining. It is well known that naive miners often produce far too many associations to be useful in practice. Many proposals exist for selecting appropriate association rules by measuring their interest in various ways; most of these approaches are statistical in nature, or share their main traits with statistical notions.
Alternatively, some existing notions of redundancy among association rules allow for a logical-style characterization and lead to irredundant bases (axiomatizations) of absolutely minimum size. Here we follow up on a study of closure-based redundancy, which, in practice, leads to smaller bases than simpler alternative forms of redundancy, with the proviso that, in principle, these bases need to be complemented with an implicational basis. One can push the intuition of redundancy further and view the interest of association rules in terms of their "novelty" with respect to other rules. An irredundant rule is so because its confidence is higher than what the rest of the rules would suggest; one can then ask: how much higher? Among several variants, a recently proposed parameter, the confidence boost, succeeds in measuring a notion of novelty along these lines in a way that better fits the needs of practical applications. However, that notion is based on plain redundancy, which is of relatively limited practical usefulness. Here we extend the confidence boost to closure-based redundancy, paying a small theoretical price to obtain several advantages in practical applications. We describe a rule-mining system implementing this contribution.