The Rich Representation Language, often abbreviated as RRL, is a
computer animation
language specifically designed to facilitate the interaction of two or more animated characters.
[''Intelligent virtual agents: 6th international working conference'' by Jonathan Matthew Gratch 2006 page 221] [P. Piwek, et al. ''RRL: A Rich Representation Language for the Description of Agent Behaviour'' in "Proceedings of the AAMAS-02 Workshop on Embodied conversational agents", July 16, 2002, Bologna, Italy.] The research effort was funded by the
European Commission
as part of the
NECA Project. The NECA (Net Environment for Embodied Emotional Conversational Agents) framework within which RRL was developed was not oriented towards the animation of movies, but towards the creation of intelligent "virtual characters" that interact within a
virtual world
and hold conversations with
emotional
content, coupled with suitable
facial expressions.
RRL was a pioneering research effort which influenced the design of other languages such as the ''Player Markup Language'', which extended parts of the design of RRL. The language design was specifically intended to lessen the training needed for modeling the interaction of multiple characters within a virtual world and to automatically generate much of the
facial animation
as well as the
skeletal animation
based on the content of the conversations. Due to the dependence of
nonverbal communication
components such as facial features on the spoken words, no animation is possible in the language without considering the
context of the scene in which the animation takes place - e.g. anger versus joy.
[''Interactive storytelling'': First Joint International Conference, edited by Ulrike Spierling, Nicolas Szilas 2008 page 93]
Language design issues
The application domain for RRL consists of scenes with two or more virtual characters. The representation of these scenes requires multiple information types such as
body postures,
facial expressions,
semantic content and meaning of conversations, etc. The design challenge is that information of one type often depends on information of another type, e.g. the body posture, the facial expression and the semantic content of the conversation need to be coordinated. For example, in an angry conversation the semantics of the conversation dictate body postures and facial expressions in a distinct form that is quite different from that of a joyful conversation. Hence any commands within the language to control facial expressions must inherently depend on the context of the conversation.
The different types of information used in RRL require different forms of expression within the language, e.g. while semantic information is represented by
grammars, the facial expression component requires graphic manipulation primitives.
A key goal in the design of RRL was ease of development, so that the construction of scenes and interactions would be available to users without advanced knowledge of programming. Moreover, the design aimed to allow for incremental development in a natural form, so that scenes could be partially prototyped, then refined to more natural-looking renderings, e.g. via the later addition of blinking or breathing.
Scene description
Borrowing theatrical terminology, each interaction session between the synthetic characters in RRL is called a ''scene''. A scene description specifies the content, timing, and emotional features employed within a scene. A specific module called the ''affective reasoner'' computes the
emotional primitives involved in the scene, including the type and the intensity of the emotions, as well as their causes. The affective reasoner uses ''emotion dimensions'' such as intensity and assertiveness.
Although XML is used as the base representation format, the scenes are described at a higher level within an object-oriented framework. In this framework nodes (i.e. objects) are connected via arrows or links. For instance, a scene is the top-level node, which is linked to others. The scene may have three specific attributes: the agents/people who participate in the scene, the discourse representation which provides the basis for conversations, and a history which records the temporal relationships between various actions.
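The node-and-link structure described above can be illustrated with a minimal Python sketch. It is not the RRL schema: the class names, field names and XML element names (`scene`, `agents`, `discourse`, `history`, `before`) are hypothetical and chosen only to mirror the three scene attributes named in the text.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Agent:
    """A participant in the scene (field names are hypothetical)."""
    name: str
    role: str


@dataclass
class Scene:
    """Top-level node linking to the three attributes named in the text."""
    agents: List[Agent] = field(default_factory=list)
    # Placeholder for the discourse representation: plain strings here.
    discourse: List[str] = field(default_factory=list)
    # Temporal relations between actions: (earlier_action, later_action).
    history: List[Tuple[str, str]] = field(default_factory=list)


def scene_to_xml(scene: Scene) -> str:
    """Serialise the scene graph to XML using made-up element names."""
    root = ET.Element("scene")
    agents_el = ET.SubElement(root, "agents")
    for agent in scene.agents:
        ET.SubElement(agents_el, "agent", name=agent.name, role=agent.role)
    discourse_el = ET.SubElement(root, "discourse")
    for condition in scene.discourse:
        ET.SubElement(discourse_el, "condition").text = condition
    history_el = ET.SubElement(root, "history")
    for earlier, later in scene.history:
        ET.SubElement(history_el, "before", first=earlier, then=later)
    return ET.tostring(root, encoding="unicode")


scene = Scene(
    agents=[Agent("A", "seller"), Agent("B", "buyer")],
    discourse=["greets(A, B)"],
    history=[("greeting", "question")],
)
xml_text = scene_to_xml(scene)
print(xml_text)
```

Keeping the in-memory objects separate from the XML serialisation reflects the distinction drawn in the text between the base format and the higher-level description.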
The scene descriptions are fed to the natural language generation module, which produces suitable sentences. The generation of natural flow in a conversation requires a high degree of representational power for the emotional elements. RRL uses a discourse representation system based on the standard method of ''referents'' and ''conditions''. The affective reasoner supplies the information needed to select the words and structures that correspond to specific sentences.
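As an illustration of a representation built from referents and conditions, the following Python sketch follows the textbook form of a discourse representation structure. The source does not give RRL's concrete encoding, so the class `DRS`, its methods and the example predicates are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DRS:
    """A discourse representation: referents plus conditions over them."""
    referents: List[str] = field(default_factory=list)
    conditions: List[Tuple[str, Tuple[str, ...]]] = field(default_factory=list)

    def add_referent(self, name: str) -> None:
        # Introduce a new discourse entity once.
        if name not in self.referents:
            self.referents.append(name)

    def add_condition(self, predicate: str, *args: str) -> None:
        # A condition may only mention referents that were introduced.
        missing = [a for a in args if a not in self.referents]
        if missing:
            raise ValueError(f"unknown referents: {missing}")
        self.conditions.append((predicate, args))

    def __str__(self) -> str:
        refs = ", ".join(self.referents)
        conds = "; ".join(f"{p}({', '.join(a)})" for p, a in self.conditions)
        return f"[{refs} | {conds}]"


# "A customer asks about a car."
drs = DRS()
drs.add_referent("x")
drs.add_referent("y")
drs.add_condition("customer", "x")
drs.add_condition("car", "y")
drs.add_condition("asks_about", "x", "y")
print(drs)  # [x, y | customer(x); car(y); asks_about(x, y)]
```

A language generation module could walk such a structure to choose nouns for the referents and a verb for the relation between them.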
Speech synthesis and emotive markers
The speech synthesis component is highly dependent on the semantic information and the
behavior of the gesture assignment module. The speech synthesis component must operate before the gesture assignment system because it includes the timing information for the spoken words and emotional
interjections. After interpreting the natural language text to be spoken, this component adds
prosodic
structure such as rhythm, stress and intonation.
The speech elements, once enriched with stress, intonation and emotional markers, are passed to the gesture assignment system.
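The ordering constraint described above, in which speech synthesis runs first because gesture assignment needs its timing information, can be sketched as follows. This is a toy model under stated assumptions: the fixed word duration, the function names `synthesize` and `assign_gestures`, and the mapping from emotion to gesture are hypothetical and stand in for the real modules.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class TimedWord:
    text: str
    start_ms: int
    end_ms: int
    stressed: bool = False


@dataclass
class TimedGesture:
    name: str
    start_ms: int
    end_ms: int


def synthesize(words: List[str], stressed: Set[str],
               ms_per_word: int = 300) -> List[TimedWord]:
    """Stand-in for speech synthesis: assigns a fixed duration per word."""
    timed = []
    t = 0
    for word in words:
        timed.append(TimedWord(word, t, t + ms_per_word, word in stressed))
        t += ms_per_word
    return timed


def assign_gestures(timed_words: List[TimedWord],
                    emotion: str) -> List[TimedGesture]:
    """Stand-in for gesture assignment: one gesture per stressed word,
    chosen from the emotional context and aligned to the word's timing."""
    gesture = "frown" if emotion == "anger" else "raised_eyebrows"
    return [TimedGesture(gesture, w.start_ms, w.end_ms)
            for w in timed_words if w.stressed]


# Speech must be synthesized first; gestures reuse its timings.
speech = synthesize(["I", "really", "disagree"], stressed={"really"})
gestures = assign_gestures(speech, emotion="anger")
print(gestures)
```

Because `assign_gestures` only reads start and end times from its input, it cannot run before those times exist, which is the dependency the text describes.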
RRL supports three separate aspects of emotion management. First, specific emotion tags may be provided for scenes and specific sentences. A number of specific commands support the display of a wide range of emotions in the faces of animated characters.
Second, there are built-in mechanisms for aligning specific facial features to emotive body postures. Third, specific emotive interjections such as sighs, yawns, chuckles, etc. may be interleaved within actions to enhance the believability of the character's utterances.
Gesture assignment and body movements
In RRL the term gesture is used in a general sense and applies to facial expressions, body posture and proper gestures. Three levels of information are processed within gesture assignment:
* Assignment of specific gestures within a scene to specific modules, e.g. "turn taking" being handled in the natural language generation module.
* Refinement and elaboration of gesture assignment following a first level synthesis of speech, e.g. the addition of blinking and breathing to a conversation.
* Interface to external modules that handle player-specific renderings such as MPEG-4 Face Animation Parameters (FAPs).
The gesture assignment system has specific gesture types such as body movements (e.g. a shrug of the shoulders as indifference vs. hanging shoulders as sadness), emblematic movements (gestures that by convention signal yes/no), iconic gestures (e.g. imitating a telephone with the fingers), deictic gestures (pointing), contrast gestures (e.g. "on one hand, but on the other hand"), and facial features (e.g. raised eyebrows, frowning, surprise or a gaze).
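The gesture categories listed above can be written as an enumeration, shown here as a Python sketch. The identifiers and the lookup table are hypothetical; only the two body-movement examples (shrug for indifference, hanging shoulders for sadness) come from the text.

```python
from enum import Enum
from typing import Optional


class GestureType(Enum):
    BODY_MOVEMENT = "body movement"
    EMBLEMATIC = "emblematic"
    ICONIC = "iconic"
    DEICTIC = "deictic"
    CONTRAST = "contrast"
    FACIAL_FEATURE = "facial feature"


# Hypothetical lookup: the same gesture type yields a different concrete
# gesture depending on the emotional context.
GESTURE_LIBRARY = {
    (GestureType.BODY_MOVEMENT, "indifference"): "shrug_shoulders",
    (GestureType.BODY_MOVEMENT, "sadness"): "hanging_shoulders",
}


def select_gesture(gesture_type: GestureType, emotion: str) -> Optional[str]:
    """Return a concrete gesture name, or None if no entry is defined."""
    return GESTURE_LIBRARY.get((gesture_type, emotion))


print(select_gesture(GestureType.BODY_MOVEMENT, "sadness"))
```

Keying the table on both type and emotion reflects the point made earlier in the article that gesture commands depend on the context of the conversation.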
See also
* Virtual Human Markup Language
* Humanoid animation
* Virtual actor