Coding book:
sls-ftp.lcs.mit.edu/pub/multiparty/coding_schemes/nakatani
Authors: Christine H. Nakatani, Barbara J. Grosz, David D. Ahn and Julia
Hirschberg (1995)
Title: "Instructions for Annotating Discourses". Technical Report Number
TR-21-95. Center for Research in Computing Technology, Harvard University:
Cambridge, MA.
Number of annotators:
A team of six annotators was trained to use the manual for the Boston
Directions Corpus project at Harvard University, in which the authors of the
manual were involved. The annotators were deliberately chosen to have no
linguistic background: "naive" users were wanted in order to obtain
"unbiased" codings (compared, for example, with codings done by the
researchers themselves).
Number of annotated dialogues:
Approx. 72 direction-giving
monologues have been coded, from four different speakers. The coding was
done while listening to the speech. The language is American English. The
monologues have been broken down into intermediate prosodic phrases for
discourse coding.
Evaluation of scheme:
The scheme is the result of augmenting and refining instructions given to
students in discourse classes; in this sense, the scheme has already been
evaluated in use. A statistical/quantitative evaluation has been carried out
but has not yet been published.
Underlying task:
The scheme is not meant to be
limited to any particular task or purpose. However, it is mainly applied
to direction-giving. The scheme is not meant for conversational speech
without clear communicative intentions.
List of phenomena annotated:
The scheme aims at annotating discourse segment purposes, that is, the
reasons why the speaker utters a given discourse segment. The purpose of
each segment is described at the start of the segment, on a line that begins
with a simple WHY? tag. Annotators work out purposes by drawing on their own
background knowledge and general intelligence. They are advised to use the
most specific expression possible that suitably describes the speaker's
purpose, and thus to prefer a description like "Give tip on removing vein
under faucet" to an expression like "Explain rinsing/washing of vein". In
general, one segment is associated with one purpose, but a segment can be
related to many purposes, and vice versa.
Purposes corresponding to different discourse segments are hierarchically
organized, from the WHY? for the discourse overall to the subsidiary
purposes of smaller segments. Segments range from the whole
dialogue/discourse to sentences; adverbial/prepositional phrases (called
mini-segments) that supply additional information are not labeled with a
WHY? tag. There are no rules about the number of subsegments allowed within
one segment. Segments/purposes at the same level do not need to be at the
same level of detail or about the same kind of information. Segments/purposes
at the same level need not be directly related to each other, but must be
related to their immediately higher segment/purpose. Two consecutive phrases
may or may not share the same purpose: if they do, their purposes belong to
the same level; if they do not, one of the two purposes is subsidiary to the
other, and thus one of the two phrases starts an embedded subsegment.
Digressions, asides, elaborations etc., which suspend the current topic flow
and make a segment discontinuous, appear as subsegments within a bigger
segment "wrapped" around them.
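The hierarchical organization described above can be pictured as a tree. The
following is a minimal sketch in Python, not part of the coding manual or of
the N.b. tool: the Segment class, the render and depth helpers, and the
direction-giving phrases are all hypothetical and serve only as illustration.
Each segment keeps its phrases and subsegments in one ordered list, so that a
digression can sit between two phrases of the segment "wrapped" around it.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Segment:
    """One discourse segment (hypothetical representation)."""
    purpose: str  # text that would follow the WHY? tag
    # Phrases (str) and embedded subsegments (Segment), in spoken order.
    items: List[Union[str, "Segment"]] = field(default_factory=list)


def render(seg: Segment, level: int = 0) -> List[str]:
    """Return indented lines: a WHY? line, then phrases and subsegments."""
    lines = ["  " * level + "WHY? " + seg.purpose]
    for item in seg.items:
        if isinstance(item, Segment):
            lines.extend(render(item, level + 1))
        else:
            lines.append("  " * (level + 1) + item)
    return lines


def depth(seg: Segment) -> int:
    """Number of nesting levels, counting the segment itself as level 1."""
    subs = [depth(i) for i in seg.items if isinstance(i, Segment)]
    return 1 + max(subs, default=0)


# Invented example: the third segment is interrupted by an aside, which is
# stored as a subsegment between two phrases of the wrapping segment.
root = Segment("Explain how to get to the library", [
    "To get to the library you start at the main gate.",
    Segment("Describe first leg", ["Walk straight ahead to the fountain."]),
    Segment("Describe second leg", [
        "Then you turn left,",
        Segment("Warn about closed path", ["the usual path is closed,"]),
        "and you follow the road to the entrance.",
    ]),
])

if __name__ == "__main__":
    print("\n".join(render(root)))
    print("levels:", depth(root))
```

The two sibling segments here ("Describe first leg", "Describe second leg")
relate to the overall purpose but not necessarily to each other, and nothing
limits how many subsegments a segment may hold, in line with the rules above.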
Examples:
WHY? Teach new cook how to make stuffed sole
We're going to be making sole, stuffed with shrimp mousse.
WHY? Explain steps of initial preparation of ingredients and equipment
WHY? Identify ingredients
In the small bag is the sole and the shrimp.
And there are ten small sole fillets and there's half a pound of medium
shrimp
WHY? Instruct new cook to get equipment ready.
Okay, and you're going to need a blender to make the mousse. So you should
get your blender out.
WHY? Explain how to make shrimp mousse
Okay, the first thing we want to do, we should do is we should make the
shrimp mousse.
WHY? Tell how to prepare shrimp
And, what you want to do is you want to take the shrimp, okay and you want
to peel and devein them.
WHY? Describe peeling
Okay, what you do is you peel the outer shell off.
WHY? Describe deveining processes
WHY? Tell how to find vein by cutting
Okay, and then you hold the shrimp and you run a knife down the outside,
it's like the back of the shrimp, okay, just cut in about a sixteenth of
a inch.
What you'll see, is there'll be a vein, there.
WHY? Tell how to remove vein
Okay, it, it'll either be a pinkish vein or a black vein.
WHY? Explain removal of pink vein
Okay, if there's a pink vein you can just pull it out,
WHY? Explain removal of dark vein
Okay, if there's a dark colored vein, you can, you wash that out. Run your
thumb down one of your
fingers down the back to get that out.
WHY? Give tip on removing vein under faucet
And you know, what I usually do is, to rinse or wash out the vein, I just
hold the shrimp under
the sink, under the uh, the faucet. I cut it and then I put it under the
faucet.
WHY? Explain how to blend shrimp and other ingredients to make mousse
Okay now um, let's see, take the shrimp and place the shrimp in the blender.
...
WHY? Describe how to prepare sole for "stuffing"
Now, get out a large casserole, like a nine by twelve.
...
Now you want to place five of the um, the sole fillets side by side in the
baking dish.
WHY? Explain how to "stuff" sole with shrimp mousse
Okay, and now you take the shrimp mousse and you uh, you place a fifth of the
mousse on each of the fillets.
...
Use all the mousse. Spread it evenly over each fillet.
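A flat listing like the one above can be split back into (purpose, text)
pairs by treating every line that begins with WHY? as the start of a new
segment. The sketch below is only an illustration under stated assumptions:
the function name split_by_why is hypothetical, each purpose is assumed to
fit on a single WHY? line, and the input is plain text rather than the actual
N.b. file format. Because the listing is flat, the nesting of segments is not
recovered; a WHY? line followed directly by another WHY? line simply yields
an empty text.

```python
from typing import List, Tuple


def split_by_why(text: str) -> List[Tuple[str, str]]:
    """Split a flat listing into (purpose, phrase text) pairs.

    Assumption: a purpose occupies exactly one line starting with 'WHY?'.
    Lines appearing before the first WHY? line are ignored.
    """
    segments: List[Tuple[str, str]] = []
    purpose = None
    phrases: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("WHY?"):
            if purpose is not None:
                segments.append((purpose, " ".join(phrases)))
            purpose = line[len("WHY?"):].strip()
            phrases = []
        elif purpose is not None:
            phrases.append(line)
    if purpose is not None:
        segments.append((purpose, " ".join(phrases)))
    return segments


# Sample lines taken from the example listing.
sample = """WHY? Describe peeling
Okay, what you do is you peel the outer shell off.
WHY? Describe deveining processes
WHY? Tell how to find vein by cutting
Okay, and then you hold the shrimp and you run a knife down the outside,
"""

if __name__ == "__main__":
    for purpose, phrase_text in split_by_why(sample):
        print(repr(purpose), "->", repr(phrase_text))
```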
Mark-up language:
The mark-up language of N.b. It is not fully compliant with SGML, but a
program distributed with N.b. converts N.b.-annotated files into standard
SGML files.
Existence of annotation tools:
N.b. Tcl/Tk interface by G.
Flammia.
Usability:
The scheme has been used in the Boston Directions Project and in the work on
the intonational correlates of discourse structure (Barbara Grosz, Julia
Hirschberg, Christine Nakatani).
Contact person:
Christine Nakatani (chn@research.att.com)
Last Modification: 27.8.1998 by Marion Klein