We consider the performative aspect of a speech act to be limited to an update of the discourse, including models of intentions. This contrasts with one commonly held view that speech acts are direct incarnations of actions. Continuing this line of thought, a speech act may be seen as a function mapping a context onto a context [Levinson1983], where the context again includes the mental states of the participants in the conversation. This allows us to separate clearly the actions invoked by a speech act, which lead to an update of the discourse, from the execution of the application-specific tasks. As a consequence, our system offers a domain-independent formulation of dialogue strategy in terms of discourse update.
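The view of a speech act as a function from contexts to contexts can be illustrated with a minimal sketch. The names `Context`, `SpeechAct` and `inform`, the flat dictionaries standing in for feature structures, and the example attribute `destination` are all assumptions made for illustration; they are not taken from the system described here.

```python
from dataclasses import dataclass, field, replace
from typing import Callable, Tuple


@dataclass(frozen=True)
class Context:
    """Hypothetical dialogue context: discourse plus models of intentions."""
    discourse: Tuple[dict, ...] = ()
    user_intention: dict = field(default_factory=dict)
    system_intention: dict = field(default_factory=dict)


# A speech act is modelled as a function mapping a context onto a context.
SpeechAct = Callable[[Context], Context]


def inform(content: dict) -> SpeechAct:
    """Build a speech act that adds `content` to the discourse (illustrative only)."""
    def update(ctx: Context) -> Context:
        # The act only updates the discourse and the model of the user's
        # intention; no application-specific task is executed here.
        return replace(
            ctx,
            discourse=ctx.discourse + (dict(content),),
            user_intention={**ctx.user_intention, **content},
        )
    return update


before = Context()
after = inform({"destination": "Paris"})(before)
```

In this sketch the original context is left unchanged and a new one is returned, which keeps the discourse update separate from any task execution that a surrounding application might perform.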
The information provided by a speech act may contribute to the information available in the discourse in different ways. First, incoming information may be compatible with the information available in the discourse and increase the specificity towards a goal. A typical case is the answer to a clarification question. Second, new information may be compatible with the intention of the speaker but incompatible with the information established in the discourse. A speech act of this kind constitutes a repair. Third, information may be incompatible with the intention of the speaker, and possibly with the information in the discourse, which indicates a subdialogue. In other words, the informational relation between the speech act and the dialogue state partly determines how the discourse is updated. Since the representations of the speech acts are constructed by unification of feature structures as a function of the parse tree, lexical information can be projected up to the speech act level in cases where it already constrains the type of the speech act.
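The three cases distinguished above can be sketched as a small classification routine. This is an illustration under simplifying assumptions: feature structures are reduced to flat dictionaries, two structures count as compatible when no shared attribute carries conflicting values, and the function names, labels and attribute values are hypothetical.

```python
def compatible(a: dict, b: dict) -> bool:
    """Flat compatibility: no attribute shared by a and b has conflicting values."""
    return all(b[key] == value for key, value in a.items() if key in b)


def classify(new: dict, discourse: dict, intention: dict) -> str:
    """Relate incoming information to the discourse and the speaker's intention."""
    if not compatible(new, intention):
        # Incompatible with the intention (and possibly with the discourse).
        return "subdialogue"
    if not compatible(new, discourse):
        # Compatible with the intention, incompatible with the discourse.
        return "repair"
    # Compatible with both: the information adds specificity towards the goal.
    return "refinement"


intention = {"task": "book_flight"}
discourse = {"task": "book_flight", "destination": "Paris"}

refinement = classify({"date": "Monday"}, discourse, intention)
repair = classify({"destination": "Berlin"}, discourse, intention)
subdialogue = classify({"task": "weather_query"}, discourse, intention)
```

The intention is tested first because the third case is defined by incompatibility with the intention whether or not the discourse is also contradicted.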
The relations between information in the discourse and the intentions of the user help us to infer a hierarchical structure of discourse. Each level in the hierarchy consists of a list of possibly underspecified feature structures and references to levels below the current level. There are no predefined dialogue structures. Rather, the structure is inferred as information enters the system. If a new speech act is classified as opening a subdialogue, a new level below the current one is created. If a communicative goal is reached, the current level is closed and the level above the current one becomes the current level again.
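The incremental construction of the hierarchy may be pictured as follows. The sketch assumes a simple tree of levels with a pointer to the current level; the class and method names are hypothetical, and staying at the top level when it is closed is an assumption, since that case is not specified above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Level:
    """One level of discourse: feature structures plus references to levels below."""
    parent: Optional["Level"] = None
    items: List[dict] = field(default_factory=list)
    children: List["Level"] = field(default_factory=list)


class Discourse:
    """Hierarchy that grows as information enters; no structure is predefined."""

    def __init__(self) -> None:
        self.root = Level()
        self.current = self.root

    def add(self, feature_structure: dict) -> None:
        self.current.items.append(feature_structure)

    def open_subdialogue(self) -> None:
        # A new level is created below the current one and becomes current.
        child = Level(parent=self.current)
        self.current.children.append(child)
        self.current = child

    def close_level(self) -> None:
        # Communicative goal reached: the level above becomes current again.
        # Assumption: closing the top level leaves the current level unchanged.
        if self.current.parent is not None:
            self.current = self.current.parent


d = Discourse()
d.add({"task": "book_flight"})
d.open_subdialogue()
d.add({"question": "which_airport"})
inside_subdialogue = d.current is not d.root
d.close_level()
```

After `close_level()` the top level is current again, while the closed subdialogue remains reachable through the reference stored in `children`.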
Compared to [Grosz and Sidner1986], we somewhat simplistically equate the structure of discourse with the hierarchical representation on the semantic level, while the intentional structure is expressed by the possibly underspecified representations in the intentional states of the user and the system. The focus of attention is limited to the current level of discourse and the levels accessible towards the top.
It is important to note that while the procedures that update the discourse and infer the dialogue structure may rely on domain-specific knowledge, the formulation of the clauses does not. Instead, the discourse update may be expressed in terms of subsumption and compatibility of different representations. The specification of the discourse update thus remains domain-independent.
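To make the two relations concrete, the following sketch defines subsumption and compatibility over nested dictionaries used as a stand-in for feature structures. It assumes untyped structures without reentrancy, treats compatibility as the existence of a unifier, and uses hypothetical function names and example attributes; the representation actually used by the system may differ.

```python
from typing import Optional


def subsumes(general: dict, specific: dict) -> bool:
    """True if `specific` contains all information in `general`, possibly more."""
    for key, value in general.items():
        if key not in specific:
            return False
        other = specific[key]
        if isinstance(value, dict) and isinstance(other, dict):
            if not subsumes(value, other):
                return False
        elif value != other:
            return False
    return True


def unify(a: dict, b: dict) -> Optional[dict]:
    """Merge two structures; return None if they carry conflicting values."""
    result = dict(a)
    for key, other in b.items():
        if key not in result:
            result[key] = other
            continue
        value = result[key]
        if isinstance(value, dict) and isinstance(other, dict):
            merged = unify(value, other)
            if merged is None:
                return None
            result[key] = merged
        elif value != other:
            return None
    return result


def compatible(a: dict, b: dict) -> bool:
    """Two structures are compatible if they can be unified."""
    return unify(a, b) is not None


general = {"trip": {"destination": "Paris"}}
specific = {"trip": {"destination": "Paris", "date": "Monday"}}
merged = unify(general, {"trip": {"date": "Monday"}})
```

Neither function refers to particular attribute names, so an update rule phrased with these relations does not depend on the domain whose attributes fill the structures.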