next up previous contents
Next: Processing of Empty Heads Up: A Uniform Tabular Algorithm Previous: Extending the Un-indexed Version

A Parsing and Generation Example

 

We will use the following grammar fragment to illustrate the behaviour of the new indexed version:

[Table: the grammar fragment (not rendered in this version)]

The phrasal backbone of this grammar is context-free. Thus we implicitly assume that strings are represented as difference lists which are simply concatenated. For parsing we can assume that the value of the essential feature is bound to the input string, and for generation it is bound to the value of SEM.
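The difference-list representation mentioned above can be sketched as follows. This is a minimal illustration, not the thesis implementation: a difference list X-Y is modeled as a pair of token lists, and two lists concatenate only when the suffix of the first matches the prefix of the second (the same mismatch that makes a scan step fail in the trace below).

```python
# A difference list X-Y covers the tokens of X minus its suffix Y.
# Concatenating X-Y with Y-Z yields X-Z, by simple "unification" of Y.

def diff(tokens, rest):
    """A difference list covering the tokens between `tokens` and `rest`."""
    return (tokens, rest)

def concat(d1, d2):
    """Concatenate two difference lists; d1's suffix must equal d2's prefix."""
    (x, y1), (y2, z) = d1, d2
    assert y1 == y2, "difference lists do not match"
    return (x, z)

def covered(d):
    """Return the tokens actually covered by the difference list."""
    x, z = d
    return x[:len(x) - len(z)]

s = ["sieht", "peter", "mit", "maria"]
vp = diff(s, s[1:])            # covers "sieht"
np = diff(s[1:], s[2:])        # covers "peter"
print(covered(concat(vp, np)))  # ['sieht', 'peter']
```

Concatenation is constant-time because no list copying is needed, which is why the context-free phrasal backbone can simply thread difference lists through the rules.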

This grammar fragment has the nice property that for the string ``sieht Peter mit Maria'' the two readings ``sehen(peter,mit(maria))'' and ``sehen(peter(mit(maria)))'' are analyzed, and that for the reading ``sehen(peter,mit(maria))'' the two strings ``sieht Peter mit Maria'' and ``sieht mit Maria Peter'' are generated. The example thus illustrates very well how completed structures can be reused in parsing as well as in generation.

Figure 4.13 illustrates the indexing mechanism for parsing the string ``sieht Peter mit Maria''. We assume that a lemma counter enumerates the lemmas as they are created (starting from 0) and that the agenda mechanism selects tasks in a depth-first manner. We also count the items that have been placed in some item set, starting from 1. The lemma counter is attached to an item as a prefix, the item counter as a suffix.
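The bookkeeping just described can be sketched as follows. This is a hypothetical illustration (the class and method names are ours, not the thesis implementation): the agenda is a stack, so task selection is depth-first, and the two counters enumerate lemmas (from 0) and items (from 1).

```python
# A minimal sketch of the agenda and counter mechanism assumed in the trace.

class Agenda:
    def __init__(self):
        self.tasks = []          # used as a stack -> depth-first selection
        self.lemma_count = 0     # enumerates lemmas as they are created
        self.item_count = 0      # enumerates items placed in item sets

    def add_task(self, task):
        self.tasks.append(task)

    def next_task(self):
        return self.tasks.pop()  # last in, first out

    def new_lemma_id(self):
        i = self.lemma_count     # lemma counter starts from 0
        self.lemma_count += 1
        return i

    def new_item_id(self):
        self.item_count += 1     # item counter starts from 1
        return self.item_count

a = Agenda()
a.add_task("task-0"); a.add_task("task-1"); a.add_task("task-2")
print(a.next_task())  # 'task-2' -- the most recently added task
```

With a queue instead of a stack in `next_task`, the same machinery would yield the breadth-first behaviour mentioned later for the generation example.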

Figure 4.13 graphically outlines the trace through the example. To make things more readable, we use only the initial of each word of the string; thus ``sPmM'' abbreviates the string ``sieht Peter mit Maria''. The sequence in which item sets are created is indicated by a counter starting from 0, so the index of the initial item set is ``sPmM0''. This counter is then used as an abbreviation for the item set indices in an item.
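The abbreviation scheme is straightforward; the following one-liner (purely illustrative, not part of the algorithm) shows how the index of an item set is formed from the initials of the remaining words plus the creation counter:

```python
# Sketch: an item-set index is the initials of the remaining input words
# followed by the item-set creation counter, e.g. "sPmM0" for the initial set.

def item_set_index(words, counter):
    return "".join(w[0] for w in words) + str(counter)

print(item_set_index(["sieht", "Peter", "mit", "Maria"], 0))  # sPmM0
print(item_set_index(["Peter", "mit", "Maria"], 1))           # PmM1
```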

We also show the status of the agenda and the currently selected task. Items that represent alternatives but are suspended appear in an extra row ``Item of alternative'', to make the depth-first strategy easier to follow. The trace can be read as follows (in a telegram-like style):

Add task 0 (the start item) to the initial item set 0 and remove 0 from the agenda. Task 0 is now item 1. Item 1 predicts two new tasks, 1 and 2, of which 2 is the current task. Add task 2 as item 2 to item set 0. Item 2 scans the word ``sieht'' and the resulting reduced task 3 is added to the agenda (creating item set 1); task 3 is added as item 3 to item set 1; item 3 predicts two new tasks 4 and 5, of which 5 is determined as the current task and added as item 4 to item set 1; scanning of item 4 fails (the difference lists do not match), so the predictor adds two new tasks 6 and 7 to the agenda (note that 6 is a recursive call to 4; this task will be blocked, so that this potential recursive loop is avoided); task 7 is added as item 6 to item set 1; on item 6 the scanner applies (scanning ``peter'') and the reduced task 9 is added to the agenda (creating item set 2); task 9 as item 8 predicts task 10; task 10 as item 9 scans ``mit'' and the resulting task 11 is added (creating item set 3); task 11 as item 10 predicts task 12; task 12 as item 11 scans ``maria'', which creates the passive item 12; item 12 completes task 10, which is added as task 15 to the agenda; task 15 as item 13 completes item 9; the generated task 16 is added as item 14 to item set 1; item 14 reduces item 2, which causes reduction of the start item 0; the first reading ``sieht [Peter mit Maria]'' is found; now the second reading ``sieht [Peter] [mit Maria]'' can benefit from the previous computation: it also scans ``sieht'', but can directly be reduced with item 8 and item 15 to yield item 21 and finally item 22; since at this point the agenda is empty, the whole process stops.

Note that the selection function ``simulates'' a leftmost selection strategy, since in each case it is the leftmost element whose essential feature is instantiated. Furthermore, it is important to note that each item set fulfils the equivalence condition described above, i.e., the inference rules construct well-formed item sets with respect to their definition.

Figure 4.13: A trace through parsing of the string ``sieht Peter mit Maria''.

We now demonstrate how the algorithm is used in generation mode. Note that we need to use the path SEM as the essential feature. This is the only requirement for the indexed version of the uniform algorithm to run efficiently for generation.
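The uniformity claim can be made concrete with a small sketch. This assumes a dictionary-based feature-structure representation and illustrative names (`index_of`, `phon`, `sem`) that are ours, not the thesis implementation: the only difference between the two processing directions is which feature path supplies the indexing value.

```python
# Sketch: items are indexed by the value found under an "essential feature"
# path -- the string value for parsing, the path SEM for generation.

def index_of(item, essential_feature):
    """Fetch the indexing value by following a feature path."""
    value = item
    for attr in essential_feature:
        value = value[attr]
    return value

item = {"phon": ("sieht", "Peter", "mit", "Maria"),
        "sem": "sehen(Peter,mit(Maria))"}

parse_index = index_of(item, ["phon"])  # parsing: index by the string
gen_index   = index_of(item, ["sem"])   # generation: index by SEM
print(gen_index)  # sehen(Peter,mit(Maria))
```

Swapping the path passed to `index_of` is the whole switch between modes; the inference rules and the agenda mechanism stay untouched.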

Figure 4.14 shows the trace for the semantic expression ``sehen(Peter,mit(Maria))''. We use similar abbreviations for the indices; thus ``s(P,m(M))'' abbreviates the semantic expression ``sehen(Peter,mit(Maria))''. We again assume that the agenda control processes tasks in a depth-first manner. The item sets are to be read as follows:

We start by adding task 0 (the start item) to the agenda. Since it is also the current task, it is added as item 1 to the initial state set 0; item 1 predicts two new tasks 1 and 2, of which 2 is determined as the current task; task 2 is added as item 2 to state set 0, which causes scanning of the lexical entry ``sieht'', and the reduced task 3 is added to the agenda (this also creates item set 1); task 3 as item 3 predicts two new tasks 4 and 5, of which task 5 is added as item 4 to item set 1; item 4 scans ``peter'' and adds the new task 6 to the agenda; task 6 as item 5 reduces item 3, which generates task 7 (task 7 creates item set 2); task 7 is added as item 6 to item set 2 and predicts the new task 8; further processing yields a new item set and, after some more steps, task 13, which is added as item 15 to item set 2; item 13 completes item 7, which creates task 14; task 14 is added as item 12 to item set 3, which causes reduction of the start item and yields the first paraphrase ``sieht Peter mit Maria''; the next tasks to process are 10 and 4 (in that order), which do not force the creation of new tasks; the next task is then task 1, which is added as item 16 to item set 3; after scanning and active completion with item 3, a new task 16 is added to the agenda, which is then added as item 17 to item set 2; item 17 creates a new task 17, which is added as item 18 to item set 1; by active completion of item 17 with item 6, task 18 is created and added as item 19 to item set 3; this item leads to the second paraphrase ``sieht mit Maria Peter'', and since the agenda is now empty, we stop.

Figure 4.14: A trace through generation of ``sehen(Peter,mit(Maria))''.

Note that the selection function ``simulates'' the semantic-head-first selection function, although coincidentally in all cases the head element is located in the leftmost position. Furthermore, note how the second paraphrase is generated by reusing the PP ``mit Maria'' (item 13) and the NP ``Peter'' (item 6) already computed during the generation of the first paraphrase. Since the item sets are indexed by semantic information, there is no problem in placing these strings at string positions different from those of the first paraphrase. In this example, the item sets are created sequentially because of the depth-first strategy; if we had used a breadth-first strategy, the item sets 1 and 2 would have been created simultaneously.

In the simple examples above we have abstracted away from most details concerning the grammatical ``realism'' of this fragment. However, in our implementation we have tested the indexing mechanism not only with the example grammar given in appendix A, but also with an adaptation of the English DCG grammar given in [Pereira and Shieber 1987]. Using this grammar we have also processed, without problems, constructions that make use of empty elements, as in the gap-threading analysis of topicalization, wh-movement, and relativization.
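Why reuse across string positions works can be sketched as follows. This is a deliberately simplified illustration with names of our own choosing: completed (passive) items are stored in a table keyed by their semantics, so a phrase built for one paraphrase is available by lookup, regardless of where its string ends up in the next paraphrase.

```python
# Sketch: a table of passive items keyed by semantic index, not by
# string position, so completed phrases can be reused in any position.

passive_items = {}

def add_passive(sem, string):
    """Record a completed phrase under its semantic index."""
    passive_items.setdefault(sem, set()).add(string)

add_passive("mit(Maria)", "mit Maria")  # built for the first paraphrase
add_passive("Peter", "Peter")

# The second paraphrase looks the phrases up instead of rebuilding them:
print(sorted(passive_items["mit(Maria)"]))  # ['mit Maria']
```

In a position-indexed parser chart, the same item would be tied to a span of the input and could not be reused at a different position; semantic indexing removes that restriction for generation.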

However, as is known from the work of, e.g., [Shieber et al. 1989], [Russell et al. 1990] and [van Noord 1993], there is one kind of construction involving empty heads which can cause termination problems for any known generation strategy. In the next section we discuss this problem in detail and give our solution to it.



Guenter Neumann
Mon Oct 5 14:01:36 MET DST 1998