next up previous contents
Next: Conclusion Up: Incremental Interleaving of Parsing Previous: Implementation

Properties of the Incremental Method

The incremental monitoring method can be seen as an additional restriction to the uniform algorithm to keep track only those computed partial results which do not force ambiguities. Note that monitoring is only triggered by the completion rules and will only be performed on consistent structures. The effect of monitoring is that the uniform algorithm will only consider a subset of possible answers, namely those which are un-ambiguous. If no un-ambiguous string can be produced then the resulting set of answers is empty. However, if the algorithm finds an answer then it is correct. In this sense the monitor just further constraints the set of computable answers for a given semantic expression.

There are two parameters which influence the behaviour of the incremental monitoring strategy: the concrete value of n for the look-back strategy and the degree of the nodes of a derivation tree, which corresponds with the length of the right hand side of the rules. We will call this the branching factor of the grammar. The maximal possible degree of a node will be denoted as maximal branching factor, and corresponds to the rule with the largest number of right-hand side elements defined in a grammar.

Recall, that the variable n refers to the number of sisters of a selected element of an active item which have to be considered as context. Now, suppose we have chosen 1 as the value of n, i.e., we are following a look-back(1) strategy. Furthermore, assume we have two grammars tex2html_wrap_inline10603 and tex2html_wrap_inline10607 which are weakly equivalent, and where the maximal branching factor of tex2html_wrap_inline10603 is 2 and that of tex2html_wrap_inline10607 is some integer m greater than 2. Thus, tex2html_wrap_inline10603 defines binary structures and tex2html_wrap_inline10607 flat structures (compared wrt. tex2html_wrap_inline10603 ).

For tex2html_wrap_inline10603 a look-back(1) strategy means, that in each case where the incremental monitor mechanism is activated the extended string determined on the basis of the contextual string of some active item is identical with the whole string of the constituent defined by the active item. The reason is that when using a grammar defining binary structures, each rule has maximally two right hand side elements, say tex2html_wrap_inline12677 and tex2html_wrap_inline12679 . Only when both have been deduced, monitoring will take place, because otherwise no contextual information would be available. Suppose, that tex2html_wrap_inline12677 is completed before tex2html_wrap_inline12679 , then the string of tex2html_wrap_inline12677 will serve as the contextual string of tex2html_wrap_inline12679 . The extended string is then the concatenation of both strings, but this means that it is just the string spanned by the whole item. This implies, that all possible ambiguities will be detected and that if the incremental monitor generates an utterance, then this utterance is unambiguous.

For tex2html_wrap_inline10607 a look-back(1) strategy means in general, that only a substring of the string defined by a constituent will be taken into account when building an extended string. But then, as we already showed in in section 5.7.2, it is possible, that not all possible ambiguities will be detected. As a consequence, this means that if the incremental monitor generates a string, this string need not necessarily be unambiguous.

Putting both together, we obtain a different result (wrt. the degree of ambiguity of a ``monitored generated string'') using the same value of n, but on grammars which only differ with respect to their maximal branching factor. Of course, if we want to make sure that our algorithm behaves in the same way for grammars with different maximal branching factor, i.e., if it is to guarantee that only unambiguous strings are generated, then we have to choose the maximal branching factor of the grammar as the value for n when performing the look-back strategy.

On the other side, if we have a grammar with a maximal branching factor greater than 2, say m, then following a look-back(n) strategy, with n ;SPMlt; m, would mean that the incremental monitoring strategy would only consider those ambiguities which can be detected with the chosen look-back strategy. However, in that case, there is no guarantee that the resulting string is unambiguous. But note that the value chosen for n directly influences the size of the extended string. This means, that in general, a small value for n would mean less effort for the parser.

The discussion made above directly reveals the problem of determining the appropriate value for the look-back strategy. If we choose the maximal branching factor, then we obtain unambiguous strings (if they actually exist), for the price of high computational effort. On the other side, if we choose a small value for n we reduce the effort but will eventually not obtain an unambiguous string. Furthermore, it cannot be guaranteed that we actually have considered relevant ambiguities.

In order to compromise between computational effort and the degree of resolved ambiguities, we have to consider some additional criteria, which are used to decide whether an ambiguity check should be applied to a newly generated string.

Assumed we have such criterions they can easily be used during monitoring, such that during the call of GET_CONTEXT this information is used firstly to check whether for the passive item ambiguity should take place, and second on each sister ``consumed'' by the look-back strategy the test are applied. Only if the passive item and its sisters fulfill the conditions expressed by these criterions an extended string will eventually be delivered.

This provides the possibility to restrict the application of the monitoring strategy, for instance, on grammar specific information. For example, it would be possible to restrict monitoring only for maximal projections or only for those structures which are known to cause ambiguities (e.g., pp-modifiers, coordinations). In our implementation we have already build in mechanisms that can take into account such additional grammar specific information. However it is a matter of future investigation (primarily on the linguistic side) to achieve meaningful and realistic criterions.


next up previous contents
Next: Conclusion Up: Incremental Interleaving of Parsing Previous: Implementation

Guenter Neumann
Mon Oct 5 14:01:36 MET DST 1998