An Empirical Study of the Use of Relevance Information in Inductive Logic Programming
Ashwin Srinivasan, Ross D. King, Michael E. Bain; 4(Jul):369-383, 2003.
Abstract
Inductive Logic Programming (ILP) systems construct models
for data using domain-specific background information.
When using these systems, it is typically assumed that
sufficient human expertise is at hand to rule out
irrelevant background information. Such irrelevant information can, and
typically does, hinder an ILP system's search for good models.
Here, we provide evidence that if expertise
is available that can provide a partial ordering
on sets of background predicates in terms of
relevance to the analysis task, then this can be used to good effect by
an ILP system. In particular, using data from biochemical domains,
we investigate an incremental strategy of including sets of predicates
in decreasing order of relevance. Results obtained suggest that:
(a) the incremental approach identifies, in substantially less time,
a model that is comparable in predictive accuracy to that
obtained with all background information in place; and
(b) the incremental approach using the relevance ordering performs
better than one that does not (that is, one that adds sets
of predicates randomly).
For a practitioner concerned with use of ILP,
the implications of these findings are two-fold:
(1) when not all background information can be used
at once (due either to limitations of the ILP system or to
the nature of the domain), expert assessment of the relevance
of background predicates can assist substantially
in the construction of good models; and
(2) good "first-cut" results can be obtained quickly by a simple exclusion of
information known to be less relevant.
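
The incremental strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `incremental_background`, the `learn` callback (standing in for one run of an ILP system), and the predicate names in the usage note are all hypothetical. It also assumes the expert's partial ordering has already been flattened into a sequence of predicate sets, most relevant first, and it makes no assumption about when to stop adding sets.

```python
from typing import Callable, List, Sequence, Set, Tuple, TypeVar

Model = TypeVar("Model")


def incremental_background(
    ranked_sets: Sequence[Set[str]],
    learn: Callable[[Set[str]], Model],
) -> List[Tuple[Set[str], Model]]:
    """Call `learn` once per stage, growing the background knowledge.

    `ranked_sets` lists sets of background predicates in decreasing
    order of relevance (an assumed linearisation of the partial order).
    `learn` is a hypothetical stand-in for an ILP run that takes the
    predicates available so far and returns a model.
    """
    stages: List[Tuple[Set[str], Model]] = []
    background: Set[str] = set()
    for predicate_set in ranked_sets:
        # Add the next most relevant set to what is already in place.
        background = background | predicate_set
        stages.append((background, learn(background)))
    return stages
```

For example, calling `incremental_background([{"atom", "bond"}, {"ring"}, {"logp"}], learn)` would invoke `learn` three times, with two, three, and four predicates available respectively. Passing the same sets in a shuffled order gives the random-ordering baseline that the abstract compares against.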