MATE Deliverable D1.1

Supported Coding Schemes

(Chiba University)

Coding book:

Instead of a single scheme, the Chiba scheme consists of three different coding schemes according to

These schemes can be applied to any task.

The coding book will be available on the WWW in the near future, but it is written in Japanese. An outline of the work is reported at the First International Conference on Language Resources and Evaluation, Spain, May 1998.

"Standardising Annotation Schemes for Japanese Discourse", A. Ichikawa et al.

Number of annotators:
10 coders (WG members)

Number of annotated dialogues:

Task                  Dialogues  Utterances
Schedule management   14         509
Route direction 131
Telephone shopping    4          277
Tourist information   1          68

Evaluations of scheme:

        A      B      C
alpha   0.577  0.680  0.612

Underlying task:
route direction, scheduling, telephone shopping, tourist information

List of phenomena annotated:

  1. U:  hai, etto, shinkanseN waNji hatsu desu ka.
  2. (I)  (What's the departure time of the bullet train?)
  3. S:  e, jyuu nana ji haN ni natte orimasu.
  4. (R)  (It's 17:30.)
  5. U:  hai.
  6. (F)  (I see.)

Mark-up language:
The markup language is as follows:

    <Utt Id=0000 Utterance_unit=open_dialogue Speaker="S"
         Topic=scheduling Depth_of_segment=2 >
     [Well] <then> please start.

Discourse markers are tagged in the transcriptions. Utterance units and discourse units are described in SGML style.
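
As an illustration of how such a start tag could be read by a program, the following is a minimal Python sketch. It is not part of the Chiba tools: the function name parse_utt_tag and the regular expression are assumptions made for this example only, and the sample tag is adapted from the example above. The sketch only collects name=value pairs; it does not validate the tag against any DTD.

```python
import re

# Hypothetical helper, not part of the Chiba scheme or of DAT.
# Matches name=value pairs where the value is either quoted ("S")
# or a bare token (open_dialogue, 0000, 2).
ATTR_RE = re.compile(r'(\w+)=("[^"]*"|[^\s>]+)')


def parse_utt_tag(tag: str) -> dict:
    """Return the attributes of one <Utt ...> start tag as a dict of strings."""
    body = tag.strip()
    if not (body.startswith("<Utt") and body.endswith(">")):
        raise ValueError("not an <Utt ...> start tag")
    attrs = {}
    for name, value in ATTR_RE.findall(body):
        # Drop surrounding double quotes, if any.
        attrs[name] = value.strip('"')
    return attrs


# Sample tag adapted from the example in this section.
sample = ('<Utt Id=0000 Utterance_unit=open_dialogue Speaker="S" '
          'Topic=scheduling Depth_of_segment=2 >')
attrs = parse_utt_tag(sample)
for name, value in attrs.items():
    print(name, "=", value)
```

All values are kept as strings, so an identifier such as 0000 keeps its leading zeros; a caller that needs the segment depth as a number would convert it explicitly.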

Existence of annotation tool:
A modified version of DAT (DRI) is used. It includes prediction of the utterance unit tag (prediction accuracy is about 70% in an open test).

Contact person:
Masato Ishizaki (masato@jaist.ac.jp)

