Building Believable Synthetic Characters

Bruce Blumberg and Christopher Kline

Synthetic Characters Group, MIT Media Laboratory
20 Ames Street, Cambridge, MA 02139
{ckline,bruce}@media.mit.edu

1  Introduction

One of the most promising intersections of entertainment and AI research is the creation of believable synthetic characters: simple but complete three-dimensional situated agents who can do and express the right things in a particular situation or scenario. Examples of these types of agents include non-player characters in computer games, digital 'extras' in Hollywood movies, and computer-based artificial pets. Often these characters do not need to perform complex reasoning about the world or build intricate plans to achieve difficult goals. Instead they may effectively play out their roles by reacting to internal and external influences in ways that are both predictable and consistent with the scenario for which they were designed.

In the process of learning to build these types of characters we have often found ourselves struggling with two fundamental problems. First, what kinds of properties or qualities have we, as observers, come to expect from a believable character? Second, given these expectations, what are the important characteristics (their 'essence') to capture when implementing them?

2  Expectations of a Synthetic Character

To learn how to build believable characters we look back upon the rich history of traditional character animation. When looking at a character brought to life by one of the great masters we know exactly what that character is thinking and feeling at every instant and, while we may not know exactly what it is about to do, we can always call upon our perception of its desires and beliefs to hazard a guess. Even when our guess is wrong, the resulting behavior nearly always "makes sense".

Classics like The Illusion of Life (Thomas 1981) explain the art of creating believable characters, which is fundamentally the art of revealing a character's inner thoughts (its beliefs and desires) through motion, sound, form, color and staging. But why do these techniques work? The American philosopher Daniel Dennett believes that they work because, in order to understand and predict the behavior of the animate objects around them, people apply what he calls the intentional stance. The intentional stance, he argues, involves treating these objects as "'rational agents' whose actions are those they deem most likely to further their 'desires' given their 'beliefs'" (Dennett 1998).

Desires are the key to identifying with and understanding a character. When we see the wolf look "longingly" at Little Red Riding Hood, perhaps licking his chops, we conclude that the wolf is hungry and wants to eat our heroine. How do we arrive at this conclusion? By applying the intentional stance, of course! Why else would he be acting hungry unless he was hungry?

Beliefs are what turn desires into actions, reflecting influences such as perceptual input ("If I see a stream, then I believe I will find water there"), emotional input ("Because I am afraid of that person, I will run away from him"), and learning ("The last time I was in this field I saw a snake, so I am sure he is there today"). We understand the actions of characters by inferring how their beliefs influence the ways they attempt to satisfy their desires.

How can we apply both the insights of skilled animators and knowledge of the intentional stance to build synthetic characters that people find both compelling, in the sense that people can empathize with them, and understandable, in that their actions can be seen as attempts to satisfy their desires given their beliefs? From an engineering standpoint, we can translate these expectations into a short list of functional subsystems necessary to satisfy them, namely: motivational drives, emotions, perception, and action selection.

2.1  Motivational Drives

For a character to appear properly motivated it must continue to work towards satisfying its desires while gracefully handling unexpected situations. For example, a character that is starving may temporarily ignore its hunger in order to flee from an approaching predator; however, once the danger has passed the character should resume searching for food. By biasing action selection towards behaviors that will satisfy the internal needs of the character, motivational drives provide a way to achieve goal-oriented behavior.

Most approaches agree on the general behavior of drives. Above all, they are cyclical and homeostatic: positive or negative deviations over time from the base state of 'satisfaction' represent under- and over-attention, respectively, to a corresponding desire. These desires can be attended to by the successful execution of attentive behaviors like eating, or by changes in external stimuli, such as temperature fluctuations or interactions with other characters. When unattended to, drives slowly increase over time; the effect of attentive actions is to shift the value of the drive back towards its homeostatic base state.
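
As a minimal sketch of this behavior, a drive can be modeled as a single value that drifts upward when unattended and is pushed back down by attentive behaviors. The Drive class, its field names, and the linear growth rule are illustrative assumptions, not a description of any particular published model.

```python
from dataclasses import dataclass


@dataclass
class Drive:
    """Hypothetical homeostatic drive (names and constants are assumptions)."""
    base: float = 0.0     # homeostatic base state ("satisfaction")
    value: float = 0.0    # current level of the drive
    growth: float = 0.25  # assumed linear increase per time step when unattended

    def tick(self, dt: float = 1.0) -> None:
        """Unattended drives slowly increase over time."""
        self.value += self.growth * dt

    def attend(self, amount: float) -> None:
        """An attentive behavior (e.g. eating) lowers the drive."""
        self.value -= amount

    def deviation(self) -> float:
        """Positive: under-attended. Negative: over-attended."""
        return self.value - self.base


hunger = Drive()
for _ in range(4):
    hunger.tick()            # four time steps without food
unattended = hunger.deviation()
hunger.attend(1.5)           # a large meal overshoots the base state
overfed = hunger.deviation()
print(unattended, overfed)
```

The sign of deviation() distinguishes under-attention from over-attention, so the same value can bias action selection towards eating in one case and away from it in the other.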

2.2  Emotions

Emotions bias action selection in much the same way as drives. For example, a character that is angry may be more prone to violent behavior than one who is happy. However, emotions also bias the quality of the character's motion. If the character is sad it should walk sadly; if it is fearful it should reach for objects in a manner which conveys its fear. In this way emotion helps observers to form an empathic bond with the character and makes its behavior appear properly motivated (Thomas 1981).

There is a great volume of literature on the modeling of emotions and other affective phenomena, including so-called 'appraisal' theories, autonomic/sub-cognitive approaches, and mixed models. However varied they may be, these models tend to agree that, instead of increasing slowly over time as drives do, emotions typically exhibit a large impulse response followed by a gradual decay back down to a base state. By altering the decay term and the gains on stimuli one can adjust the magnitude and slope of the impulse response, shaping the characteristic response of the emotion. Adjusting these parameters across the space of emotions is equivalent to shaping the 'temperament' of the character. Similarly, altering the bias term on each emotion predisposes the character to a particular emotional state, setting its 'mood'.
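
One simple way to obtain this kind of response is a first-order update in which each stimulus adds a scaled impulse and the excess over the bias shrinks by a fixed fraction every step. The sketch below assumes exactly that rule; the Emotion class and its bias, gain, and decay fields are hypothetical names chosen to mirror the terms used above.

```python
from dataclasses import dataclass


@dataclass
class Emotion:
    """Hypothetical first-order emotion model (all parameters are assumptions)."""
    bias: float = 0.0   # resting level; raising it sets the 'mood'
    gain: float = 1.0   # scales incoming stimuli (height of the impulse)
    decay: float = 0.5  # fraction of the excess over bias kept each step
    value: float = 0.0  # current intensity

    def step(self, stimulus: float = 0.0) -> float:
        """Advance one time step and return the new intensity."""
        excess = self.value - self.bias
        self.value = self.bias + self.decay * excess + self.gain * stimulus
        return self.value


fear = Emotion(gain=2.0, decay=0.5)
# One frightening stimulus followed by three quiet steps.
trace = [fear.step(1.0)] + [fear.step() for _ in range(3)]
print(trace)  # a sharp rise followed by a gradual return toward the bias
```

Under these assumptions, a character's 'temperament' is the set of gain and decay values across its emotions, and its 'mood' is the set of bias values.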

It is perfectly appropriate to model the influences of multiple emotions upon internal processes such as action selection, but it is difficult for human observers to visually perceive more than one emotion at a time. This is why animators tend to emphasize the most important emotion of a character, avoiding "mixed emotions". Because we are designing characters for humans to interact with, it is important that the character's emotional model include support for some notion of a 'dominant' emotion. This emotion is used to parameterize motion and expression, giving the observer insight into the internal desires and beliefs of the character.
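
A minimal sketch of this idea, assuming that 'dominant' simply means the emotion with the highest current intensity: all emotions remain available to internal processes, but only one is passed on to motion and expression. The function name and the motion_style dictionary are hypothetical.

```python
from typing import Dict


def dominant_emotion(levels: Dict[str, float]) -> str:
    """Return the name of the strongest emotion (ties go to the first listed)."""
    return max(levels, key=levels.get)


# Hypothetical emotion intensities; every value may still bias action selection.
levels = {"fear": 0.7, "anger": 0.4, "joy": 0.1}
dominant = dominant_emotion(levels)

# Only the dominant emotion parameterizes motion and expression.
motion_style = {"emotion": dominant, "intensity": levels[dominant]}
print(motion_style)
```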

2.3  Perception

Fundamentally, a situated, embodied agent needs a way to "make sense" of the world in which it is situated. By this we mean two things. First, the character needs a method of sensing the world around it; second, it must have a mechanism for evaluating the salience of incoming sensory information. The combination of a sensory stimulus and its corresponding evaluation mechanism is known as a perceptual elicitor, or what ethologists refer to as a releasing mechanism.

Sensory information can be provided to a synthetic character in many forms, most of which fall into three basic categories. Physical devices like temperature and infrared sensors are typical of "real-world sensing". Synthetic vision techniques attempt to extract salient features from a physical scene rendered from the viewpoint of the character. In the quick-and-dirty approach of "direct sensing", characters gain information by directly interrogating the world or an object within the world; computer games often employ this approach.
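
To make the pairing of stimulus and evaluation concrete, the sketch below combines direct sensing with a simple salience rule to form a perceptual elicitor for food. The WorldObject type, the function names, and the distance-based salience rule are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class WorldObject:
    kind: str
    position: Tuple[float, float]


def sense(world: List[WorldObject], kind: str) -> List[WorldObject]:
    """Direct sensing: read object state straight from the world model."""
    return [obj for obj in world if obj.kind == kind]


def salience(observer: Tuple[float, float], obj: WorldObject,
             max_range: float = 10.0) -> float:
    """Assumed evaluation rule: nearer objects are more salient, range-limited."""
    distance = math.hypot(obj.position[0] - observer[0],
                          obj.position[1] - observer[1])
    return max(0.0, 1.0 - distance / max_range)


def food_elicitor(observer: Tuple[float, float],
                  world: List[WorldObject]) -> float:
    """Perceptual elicitor: a sensed stimulus plus its evaluation."""
    return max((salience(observer, o) for o in sense(world, "food")),
               default=0.0)


world = [WorldObject("food", (3.0, 4.0)),    # 5 units away
         WorldObject("rock", (1.0, 0.0)),    # ignored: wrong kind
         WorldObject("food", (30.0, 40.0))]  # out of range
print(food_elicitor((0.0, 0.0), world))
```

Swapping sense() for a synthetic vision routine or a physical sensor would leave the evaluation step unchanged, which is the main reason to keep the two parts separate.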

One of the important contributions of (Blumberg 1996), building on ideas from ethology, is the notion that external perceptual influences must be reduced to a form that is compatible with internal influences such as motivations and emotions. Using a consistent internal "common currency" is essential for addressing the issue of behavioral relevance: a piece of rotting food should be as compelling to a starving character as a delicious-looking slice of cake is to a character that has already eaten too much. Given this representational paradigm, opportunistic behavior is simply a side effect of the relative difference in weighting between external and internal influences.
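
The food example can be sketched numerically. The code below assumes that internal and external influences are both expressed as plain numbers and combined by a weighted sum; the combination rule, the function name, and the sample values are assumptions, since the text above does not fix a particular formula.

```python
import math


def behavior_value(internal: float, external: float,
                   w_internal: float = 1.0, w_external: float = 1.0) -> float:
    """Combine a drive level and an elicitor value in a common currency."""
    return w_internal * internal + w_external * external


# Starving character (high hunger) facing rotting food (weak stimulus).
starving_rotten = behavior_value(internal=0.9, external=0.2)
# Over-fed character (low hunger) facing a delicious cake (strong stimulus).
sated_cake = behavior_value(internal=0.2, external=0.9)
print(math.isclose(starving_rotten, sated_cake))  # equally compelling

# Weighting external influences more heavily makes the character opportunistic.
opportunist_cake = behavior_value(0.2, 0.9, w_external=2.0)
opportunist_rotten = behavior_value(0.9, 0.2, w_external=2.0)
print(opportunist_cake > opportunist_rotten)
```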

2.4  Action Selection

The last two decades of agent research have seen a shift away from cognitivist 'Planning' approaches towards models in which behavior is characterized by the dynamics of the agent-environment interaction. In noisy and dynamic environments, nouvelle AI researchers argue, collections of simple, competing behaviors that are tightly coupled with sensors and actuators can be more effective than complex planning mechanisms, while exhibiting many of the same capabilities.

Inspired by ethological theories of behavior, some systems have expanded this approach by using a hierarchical organization to break complicated tasks down into specialized cross-exclusion groups. Within these groups, mutually-exclusive behaviors compete for dominance, using mutual and lateral inhibition to control arbitration.
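
A rough sketch of such a hierarchy follows. It deliberately simplifies the arbitration step: instead of simulating mutual and lateral inhibition, each group selects its highest-valued child directly, which approximates a winner-take-all outcome. The Behavior and Group classes and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Behavior:
    name: str
    value: float  # current strength in the common currency


@dataclass
class Group:
    """Cross-exclusion group: exactly one child is active at a time."""
    name: str
    children: List[Union["Group", Behavior]]

    @property
    def value(self) -> float:
        # A group competes at its parent's level with its strongest child.
        return max(child.value for child in self.children)

    def select(self) -> Behavior:
        # Simplification: argmax stands in for the inhibition dynamics.
        winner = max(self.children, key=lambda child: child.value)
        return winner.select() if isinstance(winner, Group) else winner


feeding = Group("feeding", [Behavior("eat", 0.25), Behavior("search", 0.75)])
root = Group("root", [feeding, Behavior("flee", 0.5)])
print(root.select().name)  # the feeding group wins, then 'search' within it
```

Because a group competes using its strongest child's value, lower-level behaviors are only considered once their parent group has won at the level above.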

Regardless of the particular implementation, the fundamental issues for any action selection scheme to address are those of adequacy, relevance, and coherence (Brooks 1990). Adequacy ensures that the behavior selection mechanism allows the character to achieve its goals. Relevance, as noted above, involves giving equal consideration to both the character's internal motivations and its external sensory stimuli, in order to achieve the correct balance between goal-driven and opportunistic behavior. Coherence of action means that behaviors exhibit the right amount of persistence and do not interfere with each other or alternate rapidly without making progress towards the intended goal (i.e., behavioral aliasing).
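
One common way to obtain persistence, sketched here as an assumption rather than as the mechanism of any system cited above, is to give the currently active behavior a small bonus so that a rival must be clearly stronger before it takes over. The function name and the bonus value are hypothetical.

```python
from typing import Dict, Optional


def select_with_persistence(values: Dict[str, float],
                            current: Optional[str] = None,
                            bonus: float = 0.25) -> str:
    """Pick the strongest behavior, favoring the one already running."""
    def score(name: str) -> float:
        return values[name] + (bonus if name == current else 0.0)
    return max(values, key=score)


values = {"eat": 0.5, "drink": 0.625}
fresh_choice = select_with_persistence(values)              # no behavior active
keep_eating = select_with_persistence(values, current="eat")
values["drink"] = 1.0                                        # thirst grows
switch = select_with_persistence(values, current="eat")
print(fresh_choice, keep_eating, switch)
```

With no bonus the character would flip between eating and drinking whenever their values crossed; too large a bonus would make it ignore urgent needs, so the value trades persistence against responsiveness.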

3  Results

Our experience in building synthetic characters has led us to two important observations. First, there is a high degree of interdependence among subsystems: perception, emotions, and drives influence action selection, and the results of action selection in turn affect the external state of the world and the internal state of the character. Second, each subsystem is essentially quantitative. For example, the changing values of emotions and drives indicate the state of internal needs, perceptual elicitors determine the relevance of percepts, and action selection mechanisms choose the most appropriate behavior from among multiple competing ones.

These observations suggest that there is a great deal of common functionality among these subsystems. In fact, most of the functions performed by these subsystems can be seen as simply different semantics applied to the same small set of underlying processes. Consequently, instead of struggling to integrate multiple disparate models for each subsystem, we have constructed a simple, value-based framework that provides these shared constructs. Though it is outside the scope of this text, we have shown that the four components of our framework provide a flexible and powerful means for implementing a variety of models for each of the aforementioned subsystems.

This framework was used to successfully build the many autonomous and semi-autonomous characters in Swamped!, an interactive cartoon experience that premiered at SIGGRAPH 98. In this exhibit the participants use a sympathetic interface (Johnson 1998) to influence the behavior of a chicken character, with the intent of protecting the chicken's eggs from being eaten by a hungry raccoon. The raccoon character is arguably the most complex fully autonomous synthetic character built to date, comprising 84 distinct behaviors influenced by 5 separate motivational drives and 6 major emotions. In addition, the continuously changing emotional state of the raccoon is conveyed through dynamically interpolated character motion and facial expressions.

4  References

Blumberg, B. 1996. Old Tricks, New Dogs: Ethology and Interactive Creatures. Ph.D. thesis, MIT Media Lab.

Brooks, R. 1990. Challenges for Complete Creature Architectures. In From Animals to Animats: Proceedings of the First International Conference on the Simulation of Adaptive Behavior.

Dennett, D. 1998. Brainchildren: Essays on Designing Minds. Cambridge, MA: MIT Press.

Johnson, M., Wilson, A., Blumberg, B., Kline, C., Bobick, A. 1998. Sympathetic Interfaces: Using Plush Toys to Control Autonomous Animated Characters. To appear in Proceedings of CHI 99.

Thomas, F., and Johnson, O. 1981. The Illusion of Life: Disney Animation. New York: Hyperion.

For more information on the Swamped! project and the Synthetic Characters group, please visit http://characters.www.media.mit.edu/groups/characters/

