Kismet is a robot head made in the 1990s at the Massachusetts Institute of Technology by Dr. Cynthia Breazeal as an experiment in affective computing: a machine that can recognize and simulate emotions. The name Kismet comes from a Turkish word meaning "fate" or sometimes "luck".
Hardware design and construction
To interact properly with human beings, Kismet contains input devices that give it auditory, visual, and proprioceptive abilities. Kismet simulates emotion through various facial expressions, vocalizations, and movement. Facial expressions are created through movements of the ears, eyebrows, eyelids, lips, jaw, and head. The cost of physical materials was an estimated US$25,000. In addition to the equipment mentioned above, there are four Motorola 68332s, nine 400 MHz PCs, and another 500 MHz PC.[Peter Menzel and Faith D'Aluisio. Robosapiens. Cambridge: The MIT Press, 2000. Pg. 66]
Software system
Kismet's social intelligence software system, or synthetic nervous system (SNS), was designed with human models of intelligent behavior in mind. It contains six subsystems as follows.
Low-level feature extraction system
This system processes raw visual and auditory information from cameras and microphones. Kismet's vision system can perform eye detection, motion detection, and skin-color detection, although the last of these is controversial. Whenever Kismet moves its head, it momentarily disables its motion detection system to avoid detecting self-motion. It also uses its stereo cameras to estimate the distance of an object in its visual field, for example to detect threats: large, close objects with a lot of movement.
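The threat heuristic and the self-motion gating described above can be illustrated with a minimal sketch. It is not Kismet's actual code: the `Blob` type, the thresholds, the focal length, and the camera baseline are all hypothetical, and the distance estimate assumes the standard pinhole stereo relation (depth = focal length × baseline / disparity).

```python
from dataclasses import dataclass

FOCAL_PX = 500.0    # hypothetical focal length, in pixels
BASELINE_M = 0.1    # hypothetical separation between the two cameras, in metres


@dataclass
class Blob:
    """A detected object; all fields are illustrative assumptions."""
    area_fraction: float  # share of the image the object covers, 0..1
    disparity_px: float   # horizontal shift between left and right images
    motion: float         # share of the object's pixels that changed, 0..1


def stereo_distance(disparity_px: float) -> float:
    """Pinhole stereo relation: depth = focal length * baseline / disparity."""
    if disparity_px <= 0:
        return float("inf")  # no measurable shift: treat as very far away
    return FOCAL_PX * BASELINE_M / disparity_px


def is_threat(blob: Blob, head_moving: bool,
              min_area: float = 0.25, max_dist_m: float = 0.5,
              min_motion: float = 0.5) -> bool:
    """Flag large, close objects with a lot of movement."""
    # While the head moves, the motion cue is ignored, because it would
    # mostly reflect the robot's own movement.
    motion = 0.0 if head_moving else blob.motion
    return (blob.area_fraction >= min_area
            and stereo_distance(blob.disparity_px) <= max_dist_m
            and motion >= min_motion)
```

In this sketch a moving head simply zeroes the motion cue, so nothing can be flagged as a threat until the head is still again. A real system might instead hold the last stable motion estimate.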
Kismet's audio system is mainly tuned towards identifying affect in infant-directed speech. In particular, it can detect five different types of affective speech: approval, prohibition, attention, comfort, and neutral. The affective intent classifier was created as follows. Low-level features such as pitch mean and energy (volume) variance were extracted from samples of recorded speech. The classes of affective intent were then modeled as a Gaussian mixture model and trained with these samples using the expectation-maximization algorithm. Classification is done in multiple stages, first classifying an utterance into one of two general groups (e.g. soothing/neutral vs. prohibition/attention/approval) and then doing more detailed classification. This architecture significantly improved performance for hard-to-distinguish classes, like ''approval'' ("You're a clever robot") versus ''attention'' ("Hey Kismet, over here").
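The staged classifier described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the original implementation: the training data is synthetic, the cluster centres and group names are made up, each model is a two-component diagonal-covariance Gaussian mixture, and only two features (pitch mean and energy variance, in arbitrary units) are used.

```python
import math
import random


def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def logsumexp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def gmm_loglik(x, comps):
    """Log-likelihood of x under a mixture given as [(weight, mean, var), ...]."""
    return logsumexp([math.log(w) + log_gauss(x, m, v) for w, m, v in comps])


def fit_gmm(samples, k=2, iters=50, floor=1e-3):
    """Fit a k-component diagonal-covariance GMM with expectation-maximization."""
    n, dim = len(samples), len(samples[0])
    ordered = sorted(samples)
    # Deterministic start: means spread over the sorted data, shared variance.
    centre = [sum(s[d] for s in samples) / n for d in range(dim)]
    spread = [max(sum((s[d] - centre[d]) ** 2 for s in samples) / n, floor)
              for d in range(dim)]
    comps = [(1.0 / k, list(ordered[(2 * j + 1) * n // (2 * k)]), list(spread))
             for j in range(k)]
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in samples:
            logs = [math.log(w) + log_gauss(x, m, v) for w, m, v in comps]
            norm = logsumexp(logs)
            resp.append([math.exp(l - norm) for l in logs])
        # M-step: re-estimate weight, mean and variance per component.
        comps = []
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-9)
            mean = [sum(r[j] * x[d] for r, x in zip(resp, samples)) / nj
                    for d in range(dim)]
            var = [max(sum(r[j] * (x[d] - mean[d]) ** 2
                           for r, x in zip(resp, samples)) / nj, floor)
                   for d in range(dim)]
            comps.append((nj / n, mean, var))
    return comps


GROUPS = {  # hypothetical names for the two coarse groups
    "soothing_or_neutral": ["comfort", "neutral"],
    "arousing": ["prohibition", "attention", "approval"],
}
# Made-up cluster centres: (pitch mean, energy variance), arbitrary units.
CENTRES = {"comfort": (-2.0, -2.0), "neutral": (-1.0, -1.0),
           "prohibition": (1.0, 2.0), "attention": (2.0, 1.0),
           "approval": (2.0, 2.0)}

rng = random.Random(0)  # synthetic stand-in for features of recorded speech
train = {c: [(rng.gauss(px, 0.1), rng.gauss(py, 0.1)) for _ in range(20)]
         for c, (px, py) in CENTRES.items()}

class_models = {c: fit_gmm(xs) for c, xs in train.items()}
group_models = {g: fit_gmm([x for c in cs for x in train[c]])
                for g, cs in GROUPS.items()}


def classify(x):
    """Stage 1 picks the coarse group; stage 2 picks a class inside it."""
    group = max(GROUPS, key=lambda g: gmm_loglik(x, group_models[g]))
    return max(GROUPS[group], key=lambda c: gmm_loglik(x, class_models[c]))
```

Calling `classify` on a feature pair first compares the two group-level mixtures and only then compares the class-level mixtures within the winning group, so similar classes such as approval and attention are only ever compared against their closest neighbours.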
Motivation system
Dr. Breazeal describes her relationship with the robot as 'something like an infant-caretaker interaction, where I'm the caretaker essentially, and the robot is like an infant'. The overview sets the human-robot relation within a frame of learning, with Dr. Breazeal providing the scaffolding for Kismet's development. It offers a demonstration of Kismet's capabilities, narrated as emotive facial expressions that communicate the robot's 'motivational state'. In Dr. Breazeal's words: "This one is anger (laugh) extreme anger, disgust, excitement, fear, this is happiness, this one is interest, this one is sadness, surprise, this one is tired, and this one is sleep."
Kismet can be in only one emotional state at a time. However, Breazeal states that Kismet is not conscious, so it does not have feelings.
Motor system
Kismet speaks a proto-language with a variety of phonemes, similar to a baby's babbling. It uses the DECtalk voice synthesizer, and changes pitch, timing, articulation, etc. to express various emotions. Intonation is used to vary between question and statement-like utterances. Lip synchronization was important for realism, and the developers used a strategy from animation:[Madsen, R. ''Animated Film: Concepts, Methods, Uses''. Interland, New York, 1969] "simplicity is the secret to successful lip animation." Thus, they did not try to imitate lip motions perfectly, but instead "create a visual short hand that passes unchallenged by the viewer."
See also
* Affective computing
* Artificial intelligence
External links
* Description de Kismet