Narrative User Interfaces

Concepts | Examples | Glossary | References

By Gerd Waloszek, SAP AG, SAP User Experience – Updated: November 5, 2009

Imagine that you enter a booth at a trade show and are welcomed by a synthetic character on a computer screen that offers you assistance. The character does not simply talk at you in a one-way fashion but has a real conversation with you. It invites you to indicate your interests in a book lying in front of the screen and shows videos on the requested topics. Finally, it asks you to hand over your business card so that you may receive further information. When you leave the booth, accompanied by a farewell greeting, you may wonder what on earth you have just encountered. Well, you caught a glimpse of the future of user interfaces, a future called "narrative user interfaces." (This scenario is based on the IZA prototype from Fraunhofer IGD; see below.)

Narrative user interfaces are based on the storytelling paradigm and set out to revolutionize the way people interact with computers. They promise to ultimately make computers accessible to everyone. Today's graphical user interfaces, even though they have opened the computer to the masses, have reached their limits. Many people have problems using them and exploiting their full capabilities. As software becomes more and more complex and powerful, the situation gets even worse – that is why Prof. Thome of the University of Würzburg, Germany, speaks of "complexware" instead of software. The fact that the functionality of software applications is to a large degree underutilized is not a minor issue; it has an enormous economic impact: business processes that do not run as efficiently as they could, for example, waste a lot of money. Narrative interfaces, the research field of Prof. Encarnação at Fraunhofer IGD, combined with Prof. Thome's demand for integrating explanation and guidance capabilities into computers, may help to unleash the real power and benefits of future computers – computers that are usable and understandable by everyone without recourse to manuals, and that explain what they do, why they do it, and what they can do for us.

This article provides an introduction to narrative user interfaces. It starts with the concepts behind these interfaces and then gives an overview of existing approaches to the new paradigm. It concludes by listing available resources, such as a glossary and references to books, journals, conferences, and people working in the field, which should allow readers to delve deeper into this promising new field.

See also the editorial From GUIs to Narrative Interfaces – From Point-and-Click to Computers Telling Stories and the Stories article Future Scope – Tangible Information.

See also the Narrative User Interfaces Glossary and the Links and References for Narrative User Interfaces (both taken from this article).

 

Concepts

This introductory section describes the ideas behind narrative user interfaces and related concepts. For definitions of these and further concepts, see the glossary.

Storytelling

We tell and surround ourselves with stories from the early days of our childhood and all through our adult lives. By telling stories, we make sense of the world: We order its events and find meaning in them by assimilating them to more or less familiar narratives (Mateas & Sengers, 1998). The psychologist Bruner (1991), for example, argues that narrative is fundamental to human understanding of intentional behavior.

Research under the storytelling paradigm comprises support for human storytelling (especially for children), databases of stories that describe how people handled commonly occurring problem situations, the design of story-understanding systems, and autonomous "intelligent" agents that engage in conversations with users. The last of these is discussed below in the context of narrative user interfaces.

Narrative Intelligence

Blair and Meyer (1997) call the human ability to organize experiences into narrative form – that is, the ability to tell stories that follow a certain dramaturgy, logic, or pattern – "narrative intelligence." The term is also used as the name of a new interdisciplinary research field with contributions from artificial intelligence (AI), computer science, human-computer interaction (HCI), and many humanistic fields of study. Currently, the field does not have a clear definition – as is the case for any new research direction. Of the many threads being pursued, the design of narrative intelligent agents and of narrative user interfaces are among the most promising. Both approaches rest on the argument that systems can be made more understandable if they communicate in ways that are easy to assimilate to narrative.

Narrative User Interfaces

Narrative user interfaces attempt to mimic the communication behavior of humans: Computers talk to people, listen to them, and even take the situational context into account. While the conversational aspect of narrative interfaces is based on the storytelling paradigm, these interfaces typically do not stop there. An additional ingredient is goal-driven behavior: The computer drives the conversation with users in pursuit of goals set by the software application. Thus, the computer behaves proactively, following a strategy, instead of waiting passively for user commands. This approach builds on the argument that humans use narrative to understand intentional behavior. It is typically implemented using intelligent agents that frame their behavior in a narrative structure, which, for example, enables them to make behavioral choices.
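
To make the contrast with passive, command-driven interfaces concrete, here is a minimal Python sketch of such a goal-driven dialog loop. It is our own illustration with invented names, not code from any of the systems discussed below:

```python
# Minimal sketch of a goal-driven conversational agent; all names are
# illustrative assumptions, not taken from any system described in this article.
from dataclasses import dataclass

@dataclass
class Goal:
    name: str            # e.g., "learn the visitor's interests"
    achieved: bool = False

class NarrativeAgent:
    def __init__(self, goals):
        self.goals = goals   # ordered goals supplied by the application

    def next_utterance(self, user_input):
        # Proactive behavior: instead of waiting for a command, the agent
        # always selects a move that advances its first unachieved goal.
        for goal in self.goals:
            if not goal.achieved:
                return self.pursue(goal, user_input)
        return "Thank you for visiting. Goodbye!"  # all goals met: end the story

    def pursue(self, goal, user_input):
        # A real system would use planning and dialog management here;
        # this placeholder simply asks for what the goal needs.
        return f"May I ask about {goal.name}?"

agent = NarrativeAgent([Goal("your interests"), Goal("your business card")])
print(agent.next_utterance(None))  # the agent opens the conversation itself
```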

To achieve more human-like communication, narrative interfaces may use humanoid representations, such as avatars with realistic facial expressions and lip-synced speech that can also express emotions. That is, narrative interfaces may include ingredients from social and affective computing, which are presented below. They may also include "natural" physical devices for interacting with the system, as known from the tangible media research field. All in all, users interact with the computer in a "mixed reality" of virtual and physical objects, in which the computer's behavior comes close to human behavior and is therefore easy for humans to understand.

While the idea and reasoning behind narrative interfaces are simple and easy to grasp, such interfaces are hard to realize. Their implementation draws heavily on techniques from artificial intelligence, such as speech recognition and synthesis, detection and generation of emotions, planning, learning, autonomous agents, and pattern recognition. It also requires state-of-the-art graphical rendering techniques and sensors that mimic the abilities of the human senses.

Authors of conversational applications have to take into account how humans tell stories and conduct conversations. The dialog between human users and the computer has to follow a certain dramaturgy that ultimately leads to the fulfillment of the computer's goal. Techniques from the film industry, theater, and literature therefore have to be integrated into the design of narrative user interfaces.

Social and Affective Computing

The terms "social computing" and "affective computing" refer to certain aspects of narrative interfaces but are also research directions in their own right.

As Reeves and Nass (1996) demonstrated in their famous book The Media Equation, people tend to interact with computers as if they were human beings. Social computing aims to support this human tendency toward human-like communication. There are two directions in this research field: narrative user interfaces based on storytelling, and synthetic characters that can exhibit emotions appropriate to the current situation. In real-world systems, both directions are often combined, as the examples below and our introductory scenario demonstrate.

The term "affective computing" was probably coined by Rosalind Picard, from the MIT Media Lab, who wrote an influential book with this title in 1997. Affective computing is concerned with the means to recognize and synthesize "emotional intelligence." Whereas emotional intelligence includes both bodily and mental events, affective computing presently focuses mainly on the apparent characteristics of verbal and nonverbal communication in relatively simple settings (Duric et al. 2002). As an example, research areas in affective computing at the MIT Media Lab comprise, among others:

  • Sensing human affect signals and recognizing patterns of affective expression
  • Understanding and modeling emotional experience, as well as synthesizing emotions in machines
  • Affective computing applications and interfaces with affective computers
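
As a toy illustration of the first item – sensing affect and recognizing patterns of affective expression – the following sketch scores the valence of a typed utterance from word cues. Real affective-computing systems fuse much richer signals (prosody, facial expression, physiology); the word lists here are invented for illustration:

```python
import re

# Invented word lists; real systems learn such cues from data and combine
# them with nonverbal signals such as prosody and facial expression.
POSITIVE = {"great", "thanks", "wonderful", "interesting"}
NEGATIVE = {"confusing", "annoying", "slow", "useless"}

def valence(utterance):
    words = set(re.findall(r"[a-z]+", utterance.lower()))
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(valence("This demo is really interesting, thanks!"))  # -> positive
```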

Thus, there is considerable overlap between research in social and affective computing. Both directions, like narrative interfaces, require the application of diverse disciplines, ranging from agent systems, models of emotion, graphics, and interface design to sociology and psychology, and even art, drama, and animation techniques.

For an overview of the affective computing field, see the Affective Computing Portal, which links to many resources. You can also find information on emotions and computers in CHI 2002 Changing the World, Changing Ourselves on the SAP Design Guild Website.

 

Examples

Below are selected examples of prototype systems that study the capabilities of narrative user interfaces. This selection is by no means complete but should suffice for a first impression of the work going on in the field. There is also more theoretically oriented work, such as the research by Chrystopher Nehaniv.

Fraunhofer: IZA – Information zum Anfassen (Tangible Information)

The IZA project, led by Prof. Encarnação at Fraunhofer IGD, Germany, explores narrative user interfaces with the prototype of a digital trade show booth. The prototype includes affective behavior, goal-directed interaction, and a multimodal user interface in a mixed-reality environment. The introductory scenario shows how the system works: The plot for the trade show booth provides the structure for a conversation between the booth's intelligent agent and an approaching visitor. The story includes a greeting and introduction phase, which ideally leads to a conversation. In the course of the conversation, the system offers information and tries to collect the visitor's business card – the ultimate goal that the system pursues. The story closes with a leave-taking scene.
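
The plot described above can be pictured as a simple state machine over scenes. The sketch below is our own reconstruction of the greeting, conversation, and leave-taking phases, not code from the IZA project; the exact event names are illustrative assumptions:

```python
# The scenes and events follow the phases described in the text.
PLOT = {
    "idle":         {"visitor approaches": "greeting"},
    "greeting":     {"visitor responds": "conversation",
                     "visitor walks away": "idle"},
    "conversation": {"business card received": "farewell",   # the system's goal
                     "visitor walks away": "farewell"},
    "farewell":     {"visitor gone": "idle"},
}

def advance(scene, event):
    """Move the plot forward; unknown events keep the current scene."""
    return PLOT[scene].get(event, scene)

scene = "idle"
for event in ["visitor approaches", "visitor responds",
              "business card received", "visitor gone"]:
    scene = advance(scene, event)
    print(f"{event} -> {scene}")
```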

The research team's vision for humane applications is that the computer engages in a natural conversation that takes context-relevant information into account, and that interaction is goal-oriented and makes use of human senses beyond vision and hearing. The natural interaction style is supported by integrating physical objects and the environment into the user interface.

This vision transcends the current desktop metaphor by utilizing multimodal interaction and narrative environments. According to the research team, it attempts to combine technology and the best qualities of people to create the perfect environment for success ("clicks and mortar").

In 2003 the research group started the new project "Virtual Human," which builds on the experiences gained in the IZA project.

For more information see www.inigraphics.net/publications/topics/2002/issue2/Topics%202_2002.pdf (PDF document) and the "Virtual Human" Website: www.virtual-human.org/start_en.html (site under construction).

Figure 1: Digital trade show booth (left; from Fraunhofer IGD) and a demo of it at SAP (right)

University of Tokyo: SCREAM (Scripting Emotion-based Agent Minds)

Helmut Prendinger and Mitsuru Ishizuka developed SCREAM (Scripting Emotion-based Agent Minds), a system that allows authors to script a character's affect-related capabilities. It is intended as a plug-in to content- and task-specific agent systems, such as interactive tutoring or entertainment systems, that provide the possible verbal utterances for a character. The system is built on top of the Microsoft Agent player technology and has been tested in three scenarios, each of which focuses on a particular aspect of the agent architecture:

  • Coffee shop scenario: impact of social role awareness
  • Casino scenario: attitude changes depending on user input
  • Japanese comic scenario: provides familiarity change in addition to attitude change
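
Purely to illustrate the idea of scripting a character's affect-related reactions, a rule in the spirit of the coffee shop and casino scenarios might look like the hypothetical sketch below. The API shown is invented and is not SCREAM's actual interface:

```python
from dataclasses import dataclass

@dataclass
class Character:
    role: str          # e.g., a waiter in the coffee shop scenario
    attitude: float    # -1.0 (hostile) .. +1.0 (friendly); shifts with user input

    def react(self, event):
        # Social role awareness: a waiter suppresses displayed anger,
        # even though the internal attitude still changes.
        if event == "insult":
            self.attitude -= 0.3
            return "polite smile" if self.role == "waiter" else "frown"
        if event == "compliment":
            self.attitude += 0.3
            return "broad smile" if self.attitude > 0 else "reserved nod"
        return "neutral expression"

waiter = Character(role="waiter", attitude=0.2)
print(waiter.react("insult"))        # -> polite smile (the role masks the emotion)
print(round(waiter.attitude, 2))     # -> the attitude has nevertheless dropped
```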

MIT Media Lab: GNL - Gesture and Narrative Language Group

Several groups at the MIT Media Lab do research in the field of narrative user interfaces and related areas. This article focuses on the work of the Gesture and Narrative Language (GNL) group led by Justine Cassell, which explores several systems based on narrative interfaces. The work of Rosalind Picard and her Affective Computing group is not covered here.

MACK: Media Lab Autonomous Conversational Kiosk

MACK is a life-sized, on-screen, animated robot that explains the Lab's people, projects, and groups to Media Lab visitors and gives directions for finding them within the Lab. The agent shares a model of the Lab with the visitor, around which the two participants can center their discussion. MACK is an embodied conversational agent (ECA) that uses a combination of speech, gesture, and reference to an ordinary paper map that users place on a table between themselves and MACK.
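
One technically interesting aspect is the fusion of speech with the shared paper map: a pointing gesture can resolve a deictic phrase such as "this group." The sketch below illustrates that resolution step with invented names and coordinates; it is not MACK's actual code:

```python
# Invented map regions (normalized x/y positions on the paper map).
MAP_REGIONS = {
    "Gesture and Narrative Language group": (0.2, 0.4),
    "Software Agents group": (0.7, 0.3),
}

def resolve_deictic(utterance, pointed_at):
    """Resolve 'this'/'here' against the position of the pointing gesture."""
    if not any(word in utterance.lower() for word in ("this", "here")):
        return None  # nothing to resolve
    # Choose the map region closest to where the visitor pointed.
    def sq_dist(xy):
        return (xy[0] - pointed_at[0]) ** 2 + (xy[1] - pointed_at[1]) ** 2
    return min(MAP_REGIONS, key=lambda name: sq_dist(MAP_REGIONS[name]))

print(resolve_deictic("How do I find this group?", (0.25, 0.45)))
# -> Gesture and Narrative Language group
```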

For more information see www.media.mit.edu/gnl/projects/kiosk/.


Figure 2: MACK is a life-sized, on-screen, animated robot

Life-Like Avatars

Many networked virtual communities, such as Multi-User Domains (MUDs) or chat rooms, where people meet in a fictitious place, offer graphical representations of the places and the people that inhabit them. Visitors that come to the environment choose a character, called an avatar, that represents them in this world. They can then explore the environment by moving the avatar around. The avatars of other users currently logged onto the system can also be seen and approached to initiate a conversation. Even though these systems are graphically rich, communication is still mostly based on text messages or digitized speech streams sent between users. The Gesture and Narrative Language Group is looking at ways to make communication, mediated through avatars, more lifelike and natural through appropriate and meaningful animation of the avatar's body and face.

For more information see avatars.www.media.mit.edu/avatars/.


Figure 3: Three different avatars created by the Media Lab's GNL group (from MIT Media Lab)

BodyChat is an early prototype of a graphical chat system that allows users to communicate via text while their avatars automatically animate attention, salutations, turn taking, back-channel feedback, and facial expressions, as well as simple body functions such as the blinking of the eyes.
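
As a toy model of this idea – deriving communicative animations automatically from the typed text rather than having users trigger them by hand – consider the sketch below; the trigger words and animation names are invented:

```python
# Invented trigger rules; the real system uses a proper social-behavior model.
RULES = [
    ({"hi", "hello"},    "wave + smile"),              # salutation
    ({"bye", "goodbye"}, "wave + avert gaze"),         # farewell
    ({"?"},              "raise eyebrows + lean in"),  # turn taking: invite reply
]

def animate(message):
    text = message.lower()
    actions = [anim for triggers, anim in RULES
               if any(t in text for t in triggers)]
    return actions or ["idle gaze + occasional eye blink"]  # baseline body functions

print(animate("Hello! Are you new here?"))
# -> ['wave + smile', 'raise eyebrows + lean in']
```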

For more information see www.media.mit.edu/groups/gn/projects/bodychat/

Situated Chat builds on the experience gained with BodyChat and automatically animates the visual representations (avatars) of the participants in an online chat. Whereas BodyChat concentrated on using a social model to animate appropriate social behavior, such as greetings and farewells, Situated Chat additionally builds a model of the discourse context, taking the shared visual environment into account, and uses it to generate appropriate nonverbal behavior, such as referring gestures.

For more information, see www.media.mit.edu/groups/gn/projects/situchat/

The Behavior Expression Animation Toolkit (BEAT) allows animators to type text that they wish to be spoken by an animated human figure, and to obtain as output appropriate and synchronized nonverbal behaviors and synthesized speech. The nonverbal behaviors are assigned on the basis of actual linguistic and contextual analysis of the typed text, relying on rules derived from extensive research into human conversational behavior.
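
Conceptually, BEAT is a pipeline from text analysis to time-aligned behavior tags that a speech synthesizer and animation engine can consume. The fragment below sketches that pipeline shape with a single made-up rule; the real toolkit applies genuinely linguistic analyses:

```python
# Made-up stand-in for BEAT's rule set: emphasize contrastive connectives
# with a beat gesture and an eyebrow raise, tagged per word so the animation
# can be synchronized with the synthesized speech.
CONTRAST_WORDS = {"but", "however", "instead"}

def annotate(text):
    annotated = []
    for word in text.split():
        tags = []
        if word.lower().strip(",.!?") in CONTRAST_WORDS:
            tags = ["beat_gesture", "eyebrow_raise"]
        annotated.append((word, tags))
    return annotated

for word, tags in annotate("I wanted a demo, but the booth was closed."):
    print(word, tags)
```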

For more information, see gn.www.media.mit.edu/groups/gn/projects/beat/

Rea: The Conversational Humanoid (MIT)

Drawing on knowledge from human discourse analysis and social cognition, the research team is developing autonomous agents capable of having a real-time, face-to-face conversation with a human. These agents are human in form and communicate using both verbal and nonverbal modalities.


Figure 4: Rea greeting screen shot (from MIT Media Lab)

For more information, see gn.www.media.mit.edu/groups/gn/projects/humanoid/

Older Projects: SAGE

SAGE stands for Storytelling Agent Generation Environment, an environment that supports children in storytelling activities. Children can create artificial storytellers as projections of their fears, feelings, interests, and role models. The storytellers thus allow them to explore their own inner lives as well as to present themselves to others. SAGE was, for example, tested with children suffering from cardiac illnesses in a Boston hospital. The project belongs to the systems that support human storytelling, in contrast to autonomous agents that themselves tell stories and conduct conversations.


Figure 5: The SAGE storytellers are embedded in a soft interface: a programmable interactive stuffed rabbit (from MIT Media Lab)

For more information, see xenia.media.mit.edu/~marinau/Sage/

 

Glossary

  • Affective Computing: Affective computing is concerned with the means to recognize and synthesize "emotional intelligence." Whereas emotional intelligence includes both bodily and mental events, affective computing presently focuses mainly on the apparent characteristics of verbal and nonverbal communication in relatively simple settings (Duric et al. 2002). As an example, research areas in affective computing at the MIT Media Lab comprise, among others:
    • Sensing human affect signals and recognizing patterns of affective expression
    • Understanding and modeling emotional experience, as well as synthesizing emotions in machines
    • Affective computing applications and interfaces with affective computers
  • Agents: Definitions from different sources at MIT:
    • A software agent is a program that performs tasks for its user (Leonard Foner, foner.www.media.mit.edu/people/foner/agents.html)
    • Software agents differ from conventional software in that they are long-lived, semi-autonomous, proactive, and adaptive (MIT Media Lab, Software Agents Group, agents.media.mit.edu/index.html)
      Adaptive agents are able to learn and to change their behavior based on experience, while fixed agenda/rule-based agents have a predefined behavior.
  • Artificial Agents: Synonym for (software) agents. Emphasizes that agents are artifacts.
  • Autonomous (Intelligent) Agents: Synonym for (software) agents. Emphasizes the autonomy and in-built intelligence (planning, inference) of software agents. Autonomy means that agents act without user intervention (for example, an agent may search the Web for relevant papers).
  • Avatar: In the Hindu religion, an avatar is an incarnation of a deity; hence, an embodiment or manifestation of an idea or greater reality. In 3D or virtual reality games and in some chat forums on the Web, an avatar is the visual "handle" or display appearance visitors use to represent themselves ("bodily incarnation in Cyberspace"). Generally speaking, an avatar is a "virtual" representation of a human.
  • Clicks and Mortar: Clicks and mortar (sometimes written clicks-and-mortar) is a term describing traditional old economy companies that are taking advantage of the Internet and the new economy it has introduced. The term derives from bricks and mortar, used in the context of the Web to describe traditional companies with physical (rather than Web site) locations.
    As typically used in the media, a clicks and mortar company is one that has begun to exploit the Internet, not only in marketing and sales, but also in terms of its total business process. A clicks and mortar firm would be likely to take part in business-to-business (B2B) exchanges. (The term's origin is attributed to David Pottruck, CEO of Charles Schwab Corp.) (from searchCRM.com, adapted)
    The research group of Prof. Encarnação (Fraunhofer IGD) uses this term slightly differently: Clicks and mortar means marrying technology and the best qualities of people to create the perfect environment for success.
  • Conversational Agents (Embodied Conversational Agents): Autonomous intelligent agent systems that engage in a natural conversation with users; these agents are called embodied if they are represented by a physical embodiment, such as an avatar.
  • Embodiment: Describes the way in which users are themselves directly represented within the display space. That is, users are considered within the space, not as observers looking onto it. This does not necessarily imply the use of immersive interface technologies, only that the user has some representation within the space. Embodiment may convey the awareness of presence, identity, and activity to other people. Avatars may be used for embodiment. (from Steve Benford, adapted)
  • Emotional Intelligence: Emotional intelligence is a type of social intelligence that involves the ability to monitor one's own and others' emotions, to discriminate among them, and to use the information to guide one's thinking and actions (Mayer & Salovey, 1993: 433).
  • HCI (Human-Computer Interaction): Human-computer interaction is a discipline concerned with the design, evaluation and implementation of interactive computing systems for human use and with the study of major phenomena surrounding them. (From ACM SIGCHI Curricula for Human-Computer Interaction)
  • IHCI (Intelligent Human-Computer Interaction): Discipline that models human cognitive, perceptual, motor, and affective factors and uses them to adapt the human-computer interface. (From Duric et al.)
  • Immersive User Interfaces: Immersive user interfaces create the illusion for users of being inside a computer-generated world where they can manipulate objects. Typically, these interfaces are used in 3D and Virtual Reality (VR) applications.
  • Intelligent Agents: Synonym for (software) agents. Intelligent agents are able to show planning and inference behavior.
  • Multi-Modal User Interfaces: User interfaces that utilize several human senses (or interaction modes), for example, touch or physical actions.
  • Narrative Environments: Environments, for example for learning or entertainment, with a narrative structure, that is, environments based on the storytelling paradigm. This approach holds that such environments communicate their content more effectively and memorably, and thus better achieve their purpose.
  • Narrative Intelligence: The human ability to organize experiences into narrative form, that is, the ability to tell stories that follow a certain dramaturgy, logic, or pattern. The term is also the name of a new interdisciplinary research field with contributions from artificial intelligence (AI), computer science, human-computer interaction (HCI), and many humanistic fields of study.
  • Narrative Intelligence Hypothesis: The hypothesis that communicating in a narrative format co-evolved with increasingly complex social dynamics among our human ancestors (from Dautenhahn).
    See also narrative intelligence.
  • Narrative User Interfaces: User interfaces or environments (e.g., learning environments) based on the storytelling paradigm. Narrative interfaces may show goal-directed behavior, may feature humanoid representations such as avatars (which may show lip-synced speech and emotions), and may offer multi-modal interaction.
  • Social Computing: Social computing aims to support the tendency of humans to interact with computers as if they were veritable social actors. Social computing may, for example, be realized by using life-like synthetic characters with context aware affective behavior (avatars). Another avenue is to implement systems with narrative intelligence, which meets the tendency of humans to frame other agents' behavior into narrative. (H. Prendinger & M. Ishizuka, 2001, modified by author)
  • Software Agents: See agents
  • Storytelling, Storytelling Paradigm: The storytelling paradigm is based on the assumption that by telling stories we make sense of the world: We order its events and find meaning in them by assimilating them to more or less familiar narratives. The psychologist Bruner (1991) argues that narrative is fundamental to human understanding of intentional behavior.
    Research under the storytelling paradigm comprises support for human storytelling, especially for children, databases of stories that describe how people handled commonly occurring problem situations, the design of story-understanding systems, and autonomous "intelligent" agents.

 

References

Articles

  • Michael Mateas & Phoebe Sengers. Narrative Intelligence. Introduction to the Narrative Intelligence Symposium, AAAI 1999 Fall Symposium Series. (PDF)
  • Ulrike Spierling. Conversational Integration of Multimedia and Multimodal Interaction. Proceedings of CHI 2000, Amsterdam. ACM, 37-38.
  • Ulrike Spierling. Storytelling Metaphors for Human-Computer Interaction in Mixed Reality. Computer Graphics, 2/2002 (15). (PDF)
  • Helmut Prendinger & Mitsuru Ishizuka. Social Computing – Lifelike Characters as Social Actors. Proceedings of the 1st Salzburg Workshop on Paradigms of Cognition (SWPC 1/2002), Salzburg, Austria, 2002. (PDF)
  • Chrystopher L. Nehaniv. Narrative for Artifacts: Transcending Context and Self. In: Phoebe Sengers & Michael Mateas (eds.), Narrative Intelligence: Papers from the 1999 AAAI Fall Symposium (November 5-7, 1999, North Falmouth, Massachusetts), FS-99-01, American Association for Artificial Intelligence, pp. 101-104, 1999. (PDF)
  • Tom Stocky & Justine Cassell. Shared Reality: Spatial Intelligence in Intuitive User Interfaces. Proceedings of IUI'02, January 13-16, 2002, San Francisco, CA, USA. (PDF)
  • Justine Cassell et al. MACK: Media Lab Autonomous Conversational Kiosk. Proceedings of IMAGINA'02, January 12-15, 2002, Monte Carlo, Monaco. (PDF)
  • Marina Umaschi & Justine Cassell. Storytelling Systems: Constructing the Innerface of the Interface. Cognitive Technologies Proceedings '97, IEEE, pp. 98-108. (PDF)

