
Application of Explanation-Based Learning for Efficient Processing of Constraint-based Grammars

Explanation-based learning (EBL) is a technique through which an intelligent system can learn by observing examples. An EBL system derives justified generalizations from training instances on the basis of declarative background knowledge. As a method, EBL performs four different learning tasks: generalization, chunking, operationalization, and justified analogy [Ellman1989]. Typically, the purpose of EBL is to produce a description of a concept that enables instances of that concept to be recognized efficiently [Minton et al. 1989]. More fundamentally, EBL is a method for improving problem-solving performance through experience. From this perspective, EBL is also of cognitive relevance, since it can help explain why humans tend to phrase ideas in the same way most of the time, adapting to a collection of idioms or prototypical constructions.

In [Neumann1994] I have described the application of EBL to efficient parsing of constraint-based grammars. The idea is to automatically generalize the derivations of training instances created by normal parsing and to use these generalized derivations (also called templates) during the run-time mode of the system. If a template can be instantiated for a new input, no further grammatical analysis is necessary. The approach is not restricted to the sentential level but can also be applied to arbitrary sub-sentential phrases, i.e., it is possible to handle substrings of an input by templates. Therefore, the EBL method can be interleaved straightforwardly with normal processing to regain flexibility that would otherwise be lost. In the paper mentioned above we have shown how this interleaving is obtained by using an agenda-based Earley-style parser.

For those strings which can be completely processed using EBL we have achieved a speed-up by a factor of thirty to fifty compared with the time required for full parsing with the large German grammar of the DISCO system [Uszkoreit et al. 1994], into which the method has been integrated. However, because the parser can also be ``initialized'' with templates instantiated for substrings of an input, the speed is also increased enormously for input that cannot be completely covered by the EBL method.

Since we assume that the training phase is driven by a representative corpus, the EBL method can also be used as the basis for sub-language approaches, for example, in order to train a system to use only those constructions which are most predictable in a specific domain. For this domain the system would then have less coverage but would be able to process the sub-language very efficiently.

The basic idea of applying EBL in the context of constraint-based natural language grammars is due to [Rayner1988] and [Samuelsson and Rayner1991]. Samuelsson [Samuelsson1994] has further improved these methods by showing how a learned grammar can be extracted from a competence grammar and a representative training set of input sentences. The learned grammar is compiled into LR parsing tables, and a special LR parser enables much faster parsing than the original grammar does. The important point of Samuelsson's approach is that he replaces the competence grammar with the performance grammar. Thus, his approach can be seen as a kind of corpus-based grammar compilation; consequently, integration with normal processing is not supported.

Although our method shares some commonalities with that of [Samuelsson1994], there are substantial differences, most notably the interleaving with normal processing and its potential for ``reversibility''.

Up to now, we have applied EBL only to parsing. However, with our uniform tabular algorithm it is now also possible to apply the same method to generation, and to interleave EBL with normal generation in the same way as we do for parsing. Note that after the generalization of a derivation tree, the resulting template is stored in a discrimination net, using the generalized sequence of morphological analyses of the training instance as its path expression. Templates can therefore be stored economically and retrieved efficiently. For generation, we would use a generalized description of the input semantics, computed in the same manner as during parsing, as the index expression for retrieving templates.
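The storage scheme just described might be sketched as a trie-style discrimination net, indexed for parsing by the generalized sequence of morphological analyses and, analogously, for generation by a generalized semantic index expression. Class and method names below are hypothetical, chosen only to illustrate the idea:

```python
class DiscriminationNet:
    """A trie over path expressions; templates sit at the end of their path."""

    def __init__(self):
        self.children = {}   # edge label (one generalized analysis) -> sub-net
        self.templates = []  # templates whose path expression ends here

    def insert(self, path, template):
        node = self
        for label in path:
            node = node.children.setdefault(label, DiscriminationNet())
        node.templates.append(template)

    def retrieve(self, path):
        node = self
        for label in path:
            node = node.children.get(label)
            if node is None:
                return []    # no template is indexed under this path
        return node.templates

net = DiscriminationNet()
# index two alternative sentential templates under one morphological path
net.insert(["det", "noun", "verb-fin"], "S-template-1")
net.insert(["det", "noun", "verb-fin"], "S-template-2")
print(net.retrieve(["det", "noun", "verb-fin"]))  # ['S-template-1', 'S-template-2']
print(net.retrieve(["det", "verb-fin"]))          # []
```

Since retrieval only walks one path, lookup cost depends on the path length rather than on the total number of stored templates, which is what makes the storage economical.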

Moreover, since we are using a reversible grammar, combining the uniform algorithm with EBL means that the same set of templates could be accessed from either morphological or semantic information. This means that we could also extend the item-sharing approach to yield a kind of template-sharing approach.

At the DFKI we have begun to extend the EBL method described in [Neumann1994] in exactly this direction. The most basic problem for generation is to find a meaningful way of ``generalizing semantic information''. In the current implementation, for instance, we abstract away from some morphological features of each word, e.g., stem, number, and tense. However, it is still an open question which semantic information should best be generalized in order to obtain a similar behaviour.
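The kind of abstraction mentioned above could look roughly as follows; the feature names and the choice of which features to keep are purely illustrative assumptions, since which features are template-relevant is exactly the open question:

```python
# "Generalize" one analysis by dropping instance-specific features (stem,
# number, tense, ...) while keeping those assumed to determine the template.
KEEP = {"cat", "case"}  # hypothetical set of template-relevant features

def generalize_analysis(feats: dict) -> dict:
    """Project an analysis onto its template-relevant features."""
    return {f: v for f, v in feats.items() if f in KEEP}

print(generalize_analysis(
    {"cat": "noun", "case": "nom", "stem": "Haus", "number": "sg"}))
# {'cat': 'noun', 'case': 'nom'}
```

For generation the same projection would have to be defined over semantic descriptions instead of morphological feature sets, and it is precisely the choice of the semantic analogue of `KEEP` that remains open.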

A further important line of research will be to incorporate statistical methods in order to control the size and retrieval of the discrimination net. A possible strategy is to attach preference values to the edges of the net. These preference values can be determined automatically from the frequency of successful retrievals and the time of the last successful retrieval. The discrimination net can then be seen as a self-organizing data structure. In a similar way, alternative templates can be organized to implement a kind of ``preferred template first'' strategy.
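Such preference-driven, self-organizing retrieval could be sketched as follows. The scoring formula (success frequency discounted by the time elapsed since the last success) is an assumption chosen for illustration, not a method from the source:

```python
import time

class ScoredTemplates:
    """Order alternative templates by a frequency/recency preference value."""

    def __init__(self, decay=0.01):
        self.decay = decay
        self.stats = {}  # template -> (success count, time of last success)

    def record_success(self, template):
        count, _ = self.stats.get(template, (0, 0.0))
        self.stats[template] = (count + 1, time.monotonic())

    def preference(self, template):
        count, last = self.stats.get(template, (0, 0.0))
        # frequency of successes, discounted by time since the last success
        return count / (1.0 + self.decay * (time.monotonic() - last))

    def ordered(self, templates):
        """'Preferred template first': try high-preference templates early."""
        return sorted(templates, key=self.preference, reverse=True)

scores = ScoredTemplates()
scores.record_success("S-template-2")
scores.record_success("S-template-2")
scores.record_success("S-template-1")
print(scores.ordered(["S-template-1", "S-template-2"]))
# ['S-template-2', 'S-template-1']
```

The same values could also guide pruning: edges whose templates fall below a preference threshold are candidates for removal, which is one way to keep the size of the net under control.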



Guenter Neumann
Mon Oct 5 14:01:36 MET DST 1998