Knut Hartmann, Bernhard Preim and Thomas Strothotte
7.5.99 Ehud Reiter
28.5.99 Knut Hartmann, Bernhard Preim and Thomas Strothotte
C1. Ehud Reiter (University of Aberdeen) (7.5.99):
I think this is a very interesting topic, but I must admit that at times I found the paper a bit frustrating because it left out many details.
In particular, from a Natural-Language Generation perspective, I think the most interesting part of this work is content determination, that is deciding what information to include in captions. But there isn't much detail given about this. I for one would very much like to know
A2. Knut Hartmann, Bernhard Preim and Thomas Strothotte (28.5.99):
Ehud Reiter asked:
1. What kind of graphical manipulations (abstractions) can be performed by the underlying ZOOM ILLUSTRATOR system?
The ZOOM ILLUSTRATOR presents one or two views onto an object, together with labels which contain annotations.
There may be several annotations of an object at different levels of detail. Whenever the user selects an object or its annotation, a more detailed object description is presented within the accompanying label, while the amount of text in other labels, or the number of labels, has to be decreased. The varying amount of text presented in one annotation thus enforces a rearrangement of all labels. To determine the maximal bounding box of all labels, the ZOOM algorithm is applied. This algorithm preserves the topological relations between all objects in 2D or 3D whenever the size of an object is changed. One effect of this algorithm is to preserve contextual information while changing the focus of an illustration.
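As a rough, one-dimensional illustration of the kind of space redistribution involved (this is not the actual ZOOM algorithm, which operates on 2D/3D bounding boxes while preserving topological relations; the function name and parameters are invented for this sketch):

```python
def redistribute(heights, focus_idx, new_height, total):
    """Grow the focused label to new_height and shrink all other labels
    proportionally, keeping their order (a 1D stand-in for topology)
    and the fixed total amount of available space."""
    remaining = total - new_height
    old_rest = sum(h for i, h in enumerate(heights) if i != focus_idx)
    scale = remaining / old_rest
    return [new_height if i == focus_idx else h * scale
            for i, h in enumerate(heights)]
```

Enlarging one label automatically shrinks its neighbours, which mirrors the paper's observation that more text in one annotation forces less text in the others.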
Another consequence of the request for additional information about an object is the application of several techniques of graphical emphasis. Emphasized objects should be clearly visible and recognizable. This can be achieved by altering presentation variables of the object itself, or those of other objects. Frequently, objects occluding the object to be emphasized are drawn semi-transparently, while the saturation of the object to be emphasized is increased.
Furthermore, geometric manipulations, such as the 3D-Zoom, may be applied to highlight objects. Finally, the system may determine another viewing position using some heuristics, for instance to increase the visible portion of the object to be highlighted.
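The transparency-and-saturation strategy described above can be sketched as follows; `SceneObject`, the attribute names, and the default values are all hypothetical and not taken from the actual ZOOM ILLUSTRATOR implementation:

```python
from dataclasses import dataclass

@dataclass
class SceneObject:
    name: str
    opacity: float = 1.0      # 1.0 = fully opaque
    saturation: float = 0.5   # colour saturation in [0, 1]

def emphasize(objects, target_name, occluder_names, alpha=0.3, sat_boost=0.4):
    """Draw occluding objects semi-transparently and boost the
    saturation of the object to be emphasized."""
    for obj in objects:
        if obj.name == target_name:
            obj.saturation = min(1.0, obj.saturation + sat_boost)
        elif obj.name in occluder_names:
            obj.opacity = alpha
    return objects
```

Unrelated objects keep their presentation variables, so the emphasized object stands out without the rest of the illustration losing its context.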
There is an online description of the ZOOM ILLUSTRATOR system available, as well as a gallery of screenshots, which may help to illustrate the ideas presented above. Furthermore, several papers (, , ) discuss various aspects of the ZOOM ILLUSTRATOR system in more detail.
2. What is the relationship between the person choosing graphical manipulations in ZOOM ILLUSTRATOR; the person deciding what kind of caption should be generated (e.g., filling out the GUI in Fig. 3); and the person reading the caption?
3. [unclear description of the] content-determination mechanism
[production rules or schema expansion?]
[bottom-up or top-down?]
In the scenario described in the paper, all these actions are performed by the same person. The configuration of the figure caption's content (see Fig. 3) was inspired by the configuration of menus and toolbars in common graphical user interfaces. Describing all manipulations -- whether initiated by the user or by the system -- would lead to long figure captions. Our idea is therefore to summarize the most important manipulations in the figure caption.
To illustrate the generation process, we will frequently refer to the system architecture presented in Fig. 4 and the configuration dialog in Fig. 3. Both visualization components, the graphic display and the text display, inform the context expert about changes to the visualization. In technical terms, both modules send events containing information about the modification as well as its initiator, which the context expert records.
The status of the Update Frequency option in Fig. 3 determines when an update, i.e. a regeneration of the figure caption, is triggered. The options correspond to an update after every modification, whenever a threshold is reached, or on the user's explicit request.
The options Information Selection and Select Tracing Item control event filters. Finally, the value of the Information Level option controls which structural elements are activated in the macrostructure presented in Fig. 5.
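The event-recording and update-triggering behaviour of the context expert might be sketched as below; the class name, mode strings, and method names are assumptions made for illustration, not the system's actual interface:

```python
class ContextExpert:
    """Records modification events from the graphic and text displays
    and decides when a caption regeneration should be triggered."""

    def __init__(self, mode="every", threshold=5):
        self.mode = mode          # "every", "threshold", or "on_request"
        self.threshold = threshold
        self.pending = []         # recorded (modification, initiator) events

    def notify(self, modification, initiator):
        """Record an event; return True if regeneration is due now."""
        self.pending.append((modification, initiator))
        if self.mode == "every":
            return True
        if self.mode == "threshold":
            return len(self.pending) >= self.threshold
        return False              # "on_request": wait for the user

    def flush(self):
        """Hand the recorded events to the caption generator."""
        events, self.pending = self.pending, []
        return events
```

Event filters corresponding to Information Selection and Select Tracing Item could then simply discard events in `notify` before they are recorded.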
The macrostructure defines the linear order of the structural elements and triggers their generation using classes of conditional templates. In this process, first a template whose condition holds is selected. Then, values for the template variables are determined. Finally, within the lexical mapping, phrases describing (numerical) values are chosen.
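The three steps above can be sketched as follows; the templates, the condition functions, and the lexicon entries are invented for illustration and do not reproduce the system's actual rules:

```python
TEMPLATES = [
    # (condition on the context, caption template)
    (lambda ctx: ctx.get("transparent"),
     "The {objects} are rendered semi-transparently to reveal the {focus}."),
    (lambda ctx: True,  # fallback template
     "The {focus} is highlighted."),
]

# lexical mapping: (numerical) values -> phrases
LEXICON = {1: "one muscle", 2: "two muscles"}

def generate(ctx):
    # 1. select the first template whose condition holds
    for condition, template in TEMPLATES:
        if condition(ctx):
            # 2. determine values for the template variables, and
            # 3. map values to phrases via the lexicon
            slots = {k: LEXICON.get(v, v) for k, v in ctx["slots"].items()}
            return template.format(**slots)
```

Each structural element of the macrostructure would contribute one such sentence, and concatenating them in the macrostructure's linear order yields the caption.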
To sum up, the options in the configuration dialog together with the predefined macrostructure control content determination, whereas the templates and the lexical mapping of template variables control content realization.
We hope that this overview of the generation process makes the discussion in Section 7.2 and Section 7.3 somewhat clearer. Nevertheless, the authors will rewrite Section 7 in the final version.
4) I also wondered where the content rules or schematas came from. Did the developers derive them from a corpus analysis? From discussions with domain experts? From user trials with different rules or schematas? From some other source?
At the very beginning of this work, we analyzed the structure of figure captions in several anatomic textbooks, such as the Sobotta atlas (see  and  in the references of our paper). Moreover, we studied the structure of other textbooks in order to generalize the results from our analysis of anatomic figure captions. Furthermore, we asked the domain experts, i.e. undergraduates studying anatomy, what they expected to find in figure captions.
Recently, a first evaluation of the interaction techniques used by the ZOOM ILLUSTRATOR for the explanation of special phenomena was carried out (). One goal was to evaluate whether undergraduates studying medicine find figure captions useful for a correct interpretation of anatomic illustrations generated interactively by the ZOOM ILLUSTRATOR.
On a scale from 0 (redundant) to 10 (indispensable), all 9 participants ranked the usefulness between 9 and 10 (9.8 on average). No other feature of the ZOOM ILLUSTRATOR reached such ratings. Despite the small number of participants, this is an impressive result, which clearly underlines the importance of figure captions, even in interactive environments.