Next: Overview Up: No Title Previous: Conclusion

A Uniform Tabular Algorithm for Parsing and Generation

In this chapter a practical uniform algorithm for processing of reversible natural language grammars is presented that can be used for efficient parsing and generation of constraint-based grammars without the need of compilation. Hence, we call this approach an interpreter for reversible grammars.

The new approach follows the paradigm of ``natural language processing as deduction'' as introduced by [Pereira and Warren1983] for the case of parsing and extended by [Shieber1988] to the case of generation. As described in Shieber's work parsing as well as generation can be thought of as the constructive proving of a string's grammaticality or of the existence of a string that matches some given criterion (e.g., a given semantic expression). Under this deductive view the difference between parsing and generation rests in which information is given as premises and what the goal is to be proved. Thus ``parsing and generation can be viewed as two processes engaged in by a single parameterized theorem prover for the logical interpretation of the formalism'' ([Shieber1988], page 614). This view is the starting point of the new approach developed in this thesis.

In order to model parsing and generation using the same basic control logic in a task specific manner, we will develop the new uniform tabular algorithm along the following dimensions:

data-driven selection function
uniform indexing technique
item sharing

The first basic idea is to use the same set of inference rules for parsing and generation - basically we use the Earley deduction proof procedure as introduced in [Pereira and Warren1983] - but to use a data-driven selection function, in the sense that the element to process next is determined on the basis of the current portion of the input. In the case of parsing this is the string, and for generation it is the semantic expression.

This enables us, for example, to obtain a left to right control regime in the case of parsing and a semantic head driven regime in the case of generation when processing the same grammar by means of the same underlying algorithm.

A second novel idea of the new approach is to use an uniform indexing mechanism for the retrieval of already completed subgoals (i.e., lemmas). It is uniform in the sense that the same basic mechanism is used for parsing and generation, but parameterized with respect to the information used for indexing lemmas. More precisely, in the case of parsing, lemmas are indexed using string information and in the case of generation semantic information is used to access lemmas. The kind of index causes completed information to be placed in the different state sets. Using this mechanism we can benefit from a table-driven view of generation, similar to that of parsing. For example, using a semantics-oriented indexing mechanism during generation massive redundancies are avoided, because once a phrase is generated, we are able to use it in a variety of places.

Based on the uniform indexing mechanism we present a novel method of grammatical processing which we call the item sharing method. The basic idea here is that partial results computed during one direction (e.g., parsing) are automatically made available for the other direction (e.g., generation), too. Since now items are shared by both directions we call them shared items. We show how the uniform tabular algorithm is easily extended to make use of shared items. The usefulness of the item sharing approach will emerge when the method of interleaving parsing and generation is introduced. We demonstrate that both the uniform tabular algorithm and the item sharing method make the exploration of such an approach a worthwhile venture.

Furthermore, the uniform tabular algorithm is embedded into an agenda control mechanism, such that new lemmas are first inserted into an agenda. The agenda is structured as a priority queue to store lemmas that have not yet been added to the table. Since lemmas are added to the table according to their priority, we can easily model depth-first, breadth-first and even preference-based strategies.

These aspects together allow us to consider parsing and generation as the same uniform process which is capable of efficiently controlling the space of possible constructions in a task specific data-oriented manner. On one hand this enables us to obtain a stronger goal-directed generation behaviour as the one proposed by [Shieber et al. 1989], by taking advantage of the Earley deduction method. On the other hand we are able to characterize parsing and generation in a fairly balanced way without the loss of efficient properties. Hence, we avoid the complications or restrictions that [Shieber1988] and [Gerdemann1991] are confronted with, because of their ``parsing oriented'' view of generation. Moreover, the new uniform tabular algorithm together with the item sharing approach makes interleaving of parsing and generation not only plausible but also practical. Furthermore, because we can easily adapt our uniform algorithm for more sophisticated control techniques (e.g., the use of preferences) it is more amenable to future refinements than most of the algorithms described so far.

Since the only relevant parameter our algorithm has with respect to parsing and generation is the difference in input structures - a string for parsing and a semantic expression for generation - the basic differences between parsing and generation are simply the different input structures. This seems to be trivial, however our approach is the first uniform algorithm that is able to adapt itself dynamically to the data, achieving a maximal degree of uniformity for parsing and generation under a task-oriented view.

Next: Overview Up: No Title Previous: Conclusion

Guenter Neumann
Mon Oct 5 14:01:36 MET DST 1998