Ehud Reiter (University of Aberdeen; email@example.com)
There is a saying that "a picture is worth 1000 words". It is certainly true that in some cases graphics is a much better mechanism for presenting information than text; in other cases, however, text seems to be more effective than graphics. In this paper I briefly summarise some of the issues involved in determining whether text or graphics is better in a particular context. This is intended to be a discussion paper, and I encourage responses or contributions from other people.
In very general terms, the decision as to whether text or graphics should be used is influenced by the following factors: (1) constraints imposed by the delivery medium or user population; (2) the type of information being communicated; (3) the expertise of the user; and (4) the manner in which users are expected to use the information.
The first factor, constraints imposed by the delivery medium or user population, is intended to cover pragmatic reasons why a medium might be impossible or impractical. For example, if the information is delivered via a slow network link or to a text-only Web browser such as lynx, then text probably will be used. On the other hand, if the information is being delivered to a wide user population which does not share a common language, or to a user population which cannot read (such as small children), then graphics will probably be the preferred medium. I will not further discuss such constraints here, since I believe they are not of great interest to the research community.
The second factor, the type of information being communicated, has been discussed in the research community, although often from the perspective of individual systems. For example, Feiner and McKeown (1990) discuss media selection in the COMET system, which produces instructions for maintaining and repairing equipment. COMET communicates location information and physical attributes with graphics, simple actions with a combination of text and graphics, and abstract actions purely with text. Roth and Hefley (1993) discuss media choice in the SAGE system, which produces explanations of quantitative modeling systems; SAGE uses text to communicate information about causality and abstract concepts, as well as small heterogeneous data sets, but graphics to communicate large homogeneous data sets.
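As a minimal sketch, the COMET allocation summarised above can be written as a simple lookup from information type to medium. The category labels, the `Medium` enum, and the `select_medium` function are hypothetical names introduced here for illustration; they paraphrase the behaviour reported by Feiner and McKeown (1990) and are not taken from COMET's actual rule base.

```python
from enum import Enum


class Medium(Enum):
    TEXT = "text"
    GRAPHICS = "graphics"
    BOTH = "text+graphics"


# Hypothetical category labels. The mapping paraphrases the COMET
# behaviour described in the paragraph above; it is not COMET's code.
COMET_STYLE_RULES = {
    "location": Medium.GRAPHICS,
    "physical_attribute": Medium.GRAPHICS,
    "simple_action": Medium.BOTH,
    "abstract_action": Medium.TEXT,
}


def select_medium(info_type: str, default: Medium = Medium.TEXT) -> Medium:
    """Return the medium for an information type, or the default if unknown."""
    return COMET_STYLE_RULES.get(info_type, default)


if __name__ == "__main__":
    for info_type in ("location", "simple_action", "abstract_action"):
        print(info_type, "->", select_medium(info_type).value)
```

Falling back to text for unlisted types is an assumption of this sketch, not something the cited systems are stated to do.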
From a more theoretical perspective, Stenning and Oberlander (1995) argue that graphical presentations do not always make clear what information is intentionally being communicated, and what information should be ignored by the user. For example (and this is my example, not theirs), if we used a set of pictures to show a user how to fix a flat tire, he or she might not realise that it was not necessary to stand exactly where the person in the picture was standing. A related point is made by Marks and Reiter (1990), who point out that readers will make inferences from the position of nodes in a node-link diagram (for example, users may assume that nodes which are physically close together are semantically related), even if the intention of the diagram is just to communicate which nodes are linked to which other nodes.
In order to get the discussion going, I will go out on a limb and make the following far-too-general claims:
Another factor influencing whether text or graphics should be used is the expertise of the user, where a general finding seems to be that graphical presentations of information are often better suited to domain experts than to novices. Partly this is because all speakers of a given language, regardless of their domain expertise, possess a shared vocabulary of tens of thousands of words, plus a very rich set of syntactic, semantic, and pragmatic rules for combining these words into sentences. On the other hand, most people do not possess a similarly rich knowledge of graphics. The 'man in the street' may know the meaning of a few hundred icons (such as traffic signs), and be aware of a few general syntactic rules (for example, if two objects are linked with an edge, they are related in some fashion), but this is much poorer than his knowledge of language. This means that a text-based information-presentation system for people who are not domain experts can build on a rich existing knowledge of language, while a graphics-based presentation system must explain everything from scratch.
Another issue is that many graphical genres have conventions which novices may not be aware of, even if they have learned the formal structure of the graphical system. Such conventions allow experts to rapidly identify chunks of the diagram, without having to resort to first-principles reasoning. For example (this comes from Petre (1995)), in electronic circuit diagrams, bistable flip-flops are usually drawn as two vertically aligned NAND gates. Hence, any time an expert sees two vertically aligned NAND gates, he or she is likely to assume that they form a flip-flop, without checking the wiring to verify this. This allows the expert to understand a diagram rapidly; a novice, who does not know the conventions, may take much longer to understand a diagram. A related point is made by Tufte (1983), who points out that statistical graphics can be extremely misleading to people who are not used to interpreting them.
Again I will go out on a limb and make some far-too-general claims in order to get the discussion going:
The final factor mentioned above was the manner in which users are expected to use the information; in other words, its communicative purpose. This has not (to the best of my knowledge) been extensively discussed in the research community (although see Roth and Hefley (1993)), but I believe it is important. In particular (and once again I am going out on a limb to get a discussion started), I believe that graphics is well-suited to analytical tasks, because graphical presentations allow human users to exploit their visual pattern-recognition abilities; and also to 'marketing' tasks, where a primary communicative goal is simply to grab and keep the user's attention. Textual presentations (including tables as well as natural-language text), on the other hand, are better suited to communicating information precisely, and to instructional contexts where a user is expected to memorise the information.
I will end my note here, and encourage interested readers to respond!
Feiner, S. and McKeown, K. (1990). Coordinating Text and Graphics in Explanation Generation. Proceedings of AAAI-1990, vol 1, pages 442-449.
Marks, J. and Reiter, E. (1990). Avoiding Unwanted Conversational Implicatures in Text and Graphics. Proceedings of AAAI-1990, vol 1, pages 450-456.
Petre, M. (1995). Why Looking isn't always Seeing: Readership Skills and Graphical Programming. Communications of the ACM 38:33-44.
Roth, S. and Hefley, W. (1993). Intelligent Multimedia Presentation Systems: Research and Principles. In M. Maybury (ed.), Intelligent Multimedia Interfaces, pages 13-58. AAAI Press.
Stenning, K. and Oberlander, J. (1995). A Cognitive Theory of Graphical and Linguistic Reasoning: Logic and Implementation. Cognitive Science 19:97-140.
Tufte, E. (1983). The Visual Display of Quantitative Information. Graphics Press.
Please send all comments and questions concerning this article to the discussion moderator firstname.lastname@example.org