This leads us to add a specific mechanism for handling such problems: refinement. Whenever a rule is learned, exceptions to this rule are systematically searched for. The result of this algorithm is a set of rules, each of which is associated with a set of exceptions. In the first part of this article, we evaluate the usefulness of this device and show that it improves results when learning linguistic structures. We also show that, with the use of refinement, some traditional problems that arise when learning sets of rules, such as threshold determination, disappear if one uses appropriate prior knowledge.
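The rule-plus-exceptions representation described above can be sketched as follows. This is an illustrative sketch only, not ALLiS's actual implementation: the `Rule` class, the context strings, and the chunk labels are all hypothetical, chosen to show how an exception set overrides a rule's default output.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical sketch of a learned rule paired with its exception set.
# Names and label conventions (B-NP / I-NP chunk tags) are illustrative.

@dataclass
class Rule:
    context: str                           # triggering context, e.g. a POS tag
    label: str                             # default label assigned by the rule
    exceptions: dict = field(default_factory=dict)  # wider context -> corrected label

def apply(rule: Rule, context: str, wider_context: str) -> Optional[str]:
    """Apply a rule; exceptions, keyed on a wider context, override the default."""
    if context != rule.context:
        return None                        # rule does not fire at all
    return rule.exceptions.get(wider_context, rule.label)

# A rule tagging determiners (DT) as chunk-initial, with one learned exception.
r = Rule(context="DT", label="B-NP", exceptions={"DT DT": "I-NP"})
print(apply(r, "DT", "NN DT"))             # default label applies -> B-NP
print(apply(r, "DT", "DT DT"))             # exception fires       -> I-NP
```

The point of the structure is that refinement never deletes a rule: it keeps the general rule and attaches the contexts where it fails, so coverage is preserved while precision improves.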
In the second part, we explore a second way of improving the efficiency of the system: using prior knowledge. Since natural language is a strongly structured object, it is worth investigating whether structural linguistic knowledge can make natural language learning more efficient and accurate. The utility of prior knowledge in inductive systems has been shown before (see [pazzani92, cardie99integrating]). This article presents experiments that try to answer the question: what kind of linguistic knowledge can improve learning?
This article is organized as follows: the inductive learning system ALLiS is described, and a first evaluation using no prior knowledge is presented. The results of this experiment without linguistic knowledge serve as the baseline against which the effect of prior knowledge is appraised. The linguistic prior knowledge is then detailed, and its (positive) effect is discussed from a computational as well as a qualitative viewpoint. The system has been applied to the shared task of the CoNLL'00 workshop, and we provide a quantitative and qualitative analysis of these results. Finally, we compare our algorithm with related systems, especially FOIDL.
A description of the UPenn tagset used throughout the article is given in Appendix 8.