MATE Deliverable D1.1

Supported Coding Schemes

Flammia's Coding Scheme
(Spoken Language Systems Group, Laboratory for Computer Science, Massachusetts Institute of Technology)

Coding book:
Author: Giovanni Flammia
Title: Instructions for Annotating Segments in Dialogues

Number of annotators:
16 graduate students with some knowledge of computer science and linguistics

Number of annotated dialogues:
25, averaging 40 dialogue turns each, with 29 to 120 utterances per dialogue.
The language of the dialogues is American English.

Evaluations of scheme:

Underlying Task:
Information-seeking dialogues: telephone conversations between customers and operators of the BellSouth Movies Now service, a telephone number that people can call to get information about current movie schedules in Atlanta.

List of phenomena annotated:
Structural/functional phenomena, such as the division of dialogues into segments, each concerning a given topic. A segment is defined as a sequence of two or more dialogue turns, including at least one utterance by each speaker, in which one relevant piece of information is exchanged between the conversation participants. Relevance is defined in terms of necessity to the continuation of the task defined in the dialogue.
Flammia's coding scheme does not provide categories with which segments should be annotated; instead, annotators are free to choose the description they consider most appropriate for a given segment. However, the speech act tags exemplified in Flammia's approach include Request, Response, Acknowledge, Accept, Reject, Repeat, Confirm, and Question Confirm.
A decision procedure for carving segments out of dialogues is specified, together with rules of thumb regarding possible correspondences between surface forms and segment boundaries.
Discourse phenomena such as greetings, introductions, offers to help, back-channel phenomena, prompts for continuation, thanks, and closings are not treated as relevant for segmentation. Only segments dealing directly with task-relevant information are marked and annotated.
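The structural part of the segment definition above (two or more turns, at least one utterance by each speaker) can be illustrated with a minimal Python sketch. The names Turn, Segment and is_valid_segment, as well as the example utterances and the segment label, are hypothetical and are not taken from Flammia's instructions or from the N.b. tool. The sketch does not attempt to model relevance, which remains a judgment made by the annotator.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Turn:
    """One dialogue turn: a speaker and the utterances in that turn."""
    speaker: str
    utterances: List[str]


@dataclass
class Segment:
    """A candidate segment with a free-text label chosen by the annotator."""
    label: str
    turns: List[Turn]


def is_valid_segment(segment: Segment, speakers: Set[str]) -> bool:
    """Check only the structural constraints of the segment definition.

    Assumption: `speakers` is the set of all participants in the dialogue.
    """
    # A segment spans two or more dialogue turns.
    if len(segment.turns) < 2:
        return False
    # Every speaker must contribute at least one utterance.
    speaking = {t.speaker for t in segment.turns if t.utterances}
    return speakers <= speaking


# Hypothetical example data, invented for illustration only.
dialogue_speakers = {"customer", "operator"}
seg = Segment(
    label="show times for one movie",
    turns=[
        Turn("customer", ["What time is it playing tonight?"]),
        Turn("operator", ["At seven and at nine thirty."]),
    ],
)
print(is_valid_segment(seg, dialogue_speakers))
```

A single turn, or a run of turns by the same speaker, fails this check, which matches the requirement that each segment involve an exchange between both participants.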


Mark-up language:
The mark-up language of N.b. (Nb). It is not fully SGML-compliant, but a program distributed with Nb converts Nb-annotated files into standard SGML files.

Existence of annotation tools:
N.b. Tcl/Tk interface by G. Flammia.

Information not available.

Contact person:
Giovanni Flammia (flammia@sls.lcs.mit.edu)

Last Modification: 27.8.1998 by Marion Klein