MATE Deliverable D1.1

Supported Coding Schemes

Responsible Editor: Marion Klein (DFKI)
Niels Ole Bernsen (MIP), Sarah Davies (HCRC), Laila Dybkjær (MIP),
Juanma Garrido (DFE), Henrik Kasch (MIP), Andreas Mengel (IMS),  Vito Pirrelli (ILC),
Massimo Poesio (HCRC), Silvia Quazza (CSELT), Claudia Soria (ILC)



Abstract:

The first step of the MATE project is to define an overall mark-up formalism based on the TEI/CES standards. This formalism accommodates the needs of current and emerging coding schemes for the levels of prosody, (morpho-)syntax, co-reference, dialogue acts, communication problems, and cross-level issues. To accomplish this, existing coding schemes have to be examined. These schemes should have proved their reliability in that they have been used in systems by a couple of novice and/or expert users to annotate a corpus of reasonable size. This report presents a survey of coding schemes that meet this requirement. The schemes are described in detail with regard to their coding book, the number of annotators working with them, the number of annotated dialogues / segments / utterances, evaluation results, underlying task, list of annotated phenomena, and mark-up language used. Annotation examples are also provided.
 

Keywords:

coding scheme, communication problem, co-reference, cross-level issue, dialogue act, multilevel annotation, morphosyntax, prosody, standardization, tools engineering



 

Executive Summary

This report gives an overview of the state of the art of coding schemes. Schemes for the levels of prosody, morpho-syntax, co-reference, dialogue acts, communication problems, and cross-level issues have been examined. To allow an appropriate comparison of schemes, guidelines have been developed (see section 1.3). These guidelines will inform the decision as to which coding schemes will be supported by the MATE project and which might be of less interest because they lack reliability. The MATE annotation standards will be developed on the basis of the results of this report.

A brief overview of the chapters of this report is given below:

Chapter 1 gives a general introduction to the topic, summarizes the project's approach, and discusses the guidelines used to standardize the retrieval of important information about schemes.

Chapters 2 - 7 present the state of the art for the five annotation levels that MATE is going to investigate, plus cross-level issues.

Chapter 8 draws conclusions about the scheme comparisons on the different levels and outlines future work.

A detailed list of all schemes under consideration can be found in the Annexes.
 

Acknowledgements

We would like to thank Masahiro Araki, Florence Bruneseaux, Sherri Condon, Mark Core, Barbara di Eugenio, Giovanni Flammia, Arne Jönsson, Staffan Larsson, Lori Levin, Christine Nakatani, Joakim Nivre, Laurent Romary, Jacques Terken, Ann Thyme-Gobbel, and Hans de Vreught for providing information on their schemes.
 

Glossary of Terms

Last Modification: 26.8.1998 by Marion Klein