next up previous
Next: Refinement Up: The ALLiS Algorithm Previous: Rule Generation and Evaluation


The Search Space

Since information contained in each node may not be enough in order to find the category of each node, we will use additional information, namely the context of each node, which is traditional in Natural Language Processing.

Once a rule has been evaluated, and if its accuracy is not high enough, the expansion step is set off: new literals are added to the rule in order to constrain it. By adding new literals, the purpose is to reduce the number of negative examples covered by the current rule. One main problem in this kind of approach is to determine which literal has to be added. FOIL, the system described in [Quinlan(1990)], uses a measure for selecting them, the information-gain. We do not implement this selection process and do not impose any criteria for selecting literals, which will be improved in the next versions of ALLiS.

Figure 1: Organization of the search space. First, local information is used, and the space is then expanded.

We partially order the way the search space is explored (Figure 1). First, local information is used: the search space is reduced to the node itself. If the accuracy is still not high enough, the immediate adjacent nodes of the current node are added into the space, until the maximum number of neighbors authorized is reached or the accuracy is high enough. Concerning the feature of each node, the more general features are first added (C, the POS tag of the word, then W the word itself). This requires then knowledge about the feature used, knowledge which is not always available in a more general case.


next up previous
Next: Refinement Up: The ALLiS Algorithm Previous: Rule Generation and Evaluation
Hammerton J. 2002-03-13