
Summary

We have developed a uniform computational model for natural language parsing and generation. It is based on a novel uniform tabular algorithm for parsing and generation from constraint-based grammars, and on a new method of grammatical processing called item sharing. On the basis of these methods we have shown how an elegant but practical interleaving of parsing and generation is achieved by a novel incremental monitoring algorithm that is used during natural language production. All of these methods have been implemented, and we have described their technical realization.

The new uniform tabular algorithm is a generalization of the Earley deduction method introduced by [Pereira and Warren1983]. Although it is uniformly defined, the algorithm is fully driven by the structure of the actual input - a string for parsing and a semantic expression for generation. This task-oriented behaviour is obtained by means of a data-driven selection function (the element to process next is determined on the basis of the current portion of the input) and a data-driven uniform indexing technique. The indexing technique is uniform in the sense that the same basic mechanism is used for parsing and generation, parameterized only with respect to the information used for indexing lemmas. More precisely, in parsing lemmas are indexed by string information, and in generation they are accessed by semantic information. The kind of index determines the state set in which a completed item is placed. Using this mechanism, generation can benefit from table-driven processing in the same way as parsing. For example, a semantics-oriented indexing mechanism avoids massive redundancies during generation, because once a phrase has been generated it can be used in any position within the sentence.
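The role of the index can be illustrated with a small sketch. It is not the implementation described in the thesis: the names `Item`, `Table`, `string_index` and `semantic_index` are hypothetical, and the table is reduced to the single point made above, namely that the same storage mechanism places completed items in different state sets depending on whether string or semantic information is used as the key.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, List, Tuple


@dataclass(frozen=True)
class Item:
    """A completed partial result (lemma). Hypothetical, simplified layout."""
    cat: str                 # syntactic category
    words: Tuple[str, ...]   # string covered by the item
    sem: str                 # semantic expression (a plain string in this toy)
    start: int = 0           # string position; only meaningful for parsing


class Table:
    """State sets keyed by a task-specific index function (assumed design)."""

    def __init__(self, index: Callable[[Item], Hashable]):
        self.index = index
        self.state_sets: Dict[Hashable, List[Item]] = {}

    def add(self, item: Item) -> bool:
        """Store the item in its state set; return False if already present."""
        state_set = self.state_sets.setdefault(self.index(item), [])
        if item in state_set:
            return False
        state_set.append(item)
        return True

    def lookup(self, key: Hashable) -> List[Item]:
        return self.state_sets.get(key, [])


def string_index(item: Item) -> Hashable:
    """Parsing: index by the string span the item covers."""
    return (item.start, item.start + len(item.words))


def semantic_index(item: Item) -> Hashable:
    """Generation: index by the semantics the item expresses."""
    return item.sem


# Parsing: the same phrase at two positions lands in two state sets.
parse_table = Table(string_index)
parse_table.add(Item("NP", ("the", "dog"), "dog", start=0))
parse_table.add(Item("NP", ("the", "dog"), "dog", start=3))

# Generation: the phrase is stored once and found again by its semantics,
# wherever in the sentence it is needed.
gen_table = Table(semantic_index)
first_time = gen_table.add(Item("NP", ("the", "dog"), "dog"))
second_time = gen_table.add(Item("NP", ("the", "dog"), "dog"))
```

In this toy, `Table` itself never changes between the two tasks; only the index function passed to it does, which is the sense of "parameterized" used above.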

The only parameter that distinguishes parsing from generation in our algorithm is the structure of the input. This seems to be trivial; however, our approach is the first uniform algorithm that is able to adapt its behaviour dynamically to the data, achieving a maximal degree of uniformity of parsing and generation. None of the current uniform approaches exhibits such a degree of uniformity. Moreover - in some sense as a side-effect - we have shown that it is superior to the semantic-head-driven generation algorithm developed by [Shieber et al.1990], which is currently the most prominent algorithm used for grammatical generation.

There is evidence that comprehension and generation are not just inverses of each other, but are also related at the processing level. For example, human language production also involves some monitoring of the output, and it is widely accepted that this monitoring is performed by the comprehension mechanism. However, it has been an open question how such behaviour can be realized in practice in computer systems. We have paid serious attention to this problem, and our answer is that the systematic pursuit of uniformity in natural language processing - as followed in this thesis - establishes the necessary preconditions for a practical interleaving of parsing and generation.

The specific results we have obtained are twofold. First, we have shown that the uniform tabular algorithm can straightforwardly be extended in order to share partial results in both directions. We have called this property item sharing, because items (i.e., the internal representation of partial results) computed in one direction are automatically made accessible to the other direction as well: results computed during parsing are usable during generation and vice versa. Second, we have specified an incremental monitoring mechanism in order to demonstrate how an interleaved approach can contribute to the solution of complex problems. The underlying mechanism used during monitoring can be described as an incremental generate-parse-revise strategy: substrings produced during generation are parsed to test whether they lead to ambiguities, and detected ambiguities are handled by means of revision. This mechanism has been integrated into the uniform algorithm in an elegant and practical way.
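The generate-parse-revise strategy can be sketched in a few lines. This is a deliberately crude illustration under stated assumptions, not the monitoring algorithm of the thesis: the paraphrase table, the reading table and the names `parse` and `monitored_generate` are all hypothetical, ambiguity is looked up per substring instead of being computed by a parser, and revision simply means trying the next paraphrase.

```python
from typing import Callable, Dict, List, Sequence, Set

# Toy data, assumed for illustration only.
# Candidate realizations for each semantic unit, in order of preference.
PARAPHRASES: Dict[str, List[str]] = {
    "remove(folder)": ["remove the folder"],
    "with(system_tools)": ["with the system tools",
                           "by means of the system tools"],
}

# Readings a (pretend) parser would find for each substring.
READINGS: Dict[str, Set[str]] = {
    "remove the folder": {"action"},
    "with the system tools": {"instrument", "modifier-of-folder"},
    "by means of the system tools": {"instrument"},
}


def parse(substring: str) -> Set[str]:
    """Stand-in for the parser: return the set of readings of a substring."""
    return READINGS.get(substring, set())


def monitored_generate(
    sem_units: Sequence[str],
    paraphrases: Dict[str, List[str]],
    parse_fn: Callable[[str], Set[str]],
) -> str:
    """Generate unit by unit, parsing each substring as soon as it is produced."""
    output: List[str] = []
    for unit in sem_units:
        candidates = paraphrases[unit]
        chosen = candidates[0]            # generate: preferred realization
        for candidate in candidates:
            readings = parse_fn(candidate)  # parse: monitor the substring
            if len(readings) == 1:          # exactly one reading: accept it
                chosen = candidate
                break
            # otherwise revise: fall through to the next paraphrase
        output.append(chosen)             # if all are ambiguous, keep the first
    return " ".join(output)


result = monitored_generate(
    ["remove(folder)", "with(system_tools)"], PARAPHRASES, parse
)
```

The point of the sketch is only the control flow: monitoring happens on each substring as soon as it is produced, so a revision replaces one substring and leaves the rest of the output untouched.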

In this thesis we have considered self-monitoring and revision in depth only at the grammatical level. However, by showing in detail how the uniform model contributes to the solution of this problem, we were able to demonstrate that uniformity is of real practical relevance for natural language systems. By considering uniformity and interleaving of natural language parsing and generation from a strictly computational point of view we have broken new ground. In the next section we will discuss further important research directions for uniform processing.


Guenter Neumann
Mon Oct 5 14:01:36 MET DST 1998