Intelligent personal assistant
   HOME

TheInfoList



OR:

An intelligent virtual assistant (IVA) or intelligent personal assistant (IPA) is a
software agent In computer science, a software agent or software AI is a computer program that acts for a user or other program in a relationship of agency, which derives from the Latin ''agere'' (to do): an agreement to act on one's behalf. Such "action on beha ...
that can perform tasks or services for an individual based on commands or questions. The term "
chatbot A chatbot or chatterbot is a software application used to conduct an on-line chat conversation via text or text-to-speech, in lieu of providing direct contact with a live human agent. Designed to convincingly simulate the way a human would behav ...
" is sometimes used to refer to virtual assistants generally or specifically accessed by
online chat Online chat may refer to any kind of communication over the Internet that offers a real-time transmission of text messages from sender to receiver. Chat messages are generally short in order to enable other participants to respond quickly. Ther ...
. In some cases, online chat programs are exclusively for entertainment purposes. Some virtual assistants are able to interpret human speech and respond via synthesized voices. Users can ask their assistants questions, control home automation devices and media playback via voice, and manage other basic tasks such as email, to-do lists, and calendars with verbal commands. A similar concept, however with differences, lays under the
dialogue system A dialogue system, or conversational agent (CA), is a computer system intended to converse with a human. Dialogue systems employed one or more of text, speech, graphics, haptics, gestures, and other modes for communication on both the input and o ...
s. As of 2017, the capabilities and usage of virtual assistants are expanding rapidly, with new products entering the market and a strong emphasis on both email and
voice user interface A voice-user interface (VUI) makes spoken human interaction with computers possible, using speech recognition to understand spoken commands and answer questions, and typically text to speech to play a reply. A voice command device is a device con ...
s. Apple and Google have large installed bases of users on
smartphone A smartphone is a portable computer device that combines mobile telephone and computing functions into one unit. They are distinguished from feature phones by their stronger hardware capabilities and extensive mobile operating systems, whic ...
s. Microsoft has a large installed base of
Windows Windows is a group of several proprietary graphical operating system families developed and marketed by Microsoft. Each family caters to a certain sector of the computing industry. For example, Windows NT for consumers, Windows Server for ser ...
-based
personal computers A personal computer (PC) is a multi-purpose microcomputer whose size, capabilities, and price make it feasible for individual use. Personal computers are intended to be operated directly by an end user, rather than by a computer expert or tec ...
, smartphones and
smart speakers A smart speaker is a type of loudspeaker and voice command device with an integrated virtual assistant that offers interactive actions and hands-free activation with the help of one "hot word" (or several "hot words"). Some smart speakers can a ...
. Amazon has a large install base for smart speakers. Conversica has over 100 million engagements via its email and SMS interface intelligent virtual assistants for business. Now the Virtual Assistant does not refer only to a machine but a person whose primary job is to help his employer to do a specific online job virtually. Most of time this person is residing in another part of the world.


History


Experimental decades: 1910s–1980s

Radio Rex was the first voice activated toy released in 1922. It was a wooden toy in the shape of a dog that would come out of its house when its name is called. In 1952,
Bell Labs Nokia Bell Labs, originally named Bell Telephone Laboratories (1925–1984), then AT&T Bell Laboratories (1984–1996) and Bell Labs Innovations (1996–2007), is an American industrial Research and development, research and scientific developm ...
presented "Audrey", the Automatic Digit Recognition machine. It occupied a six- foot-high relay rack, consumed substantial power, had streams of cables and exhibited the myriad maintenance problems associated with complex vacuum-tube circuitry. It could recognize the fundamental units of speech, phonemes. It was limited to accurate recognition of digits spoken by designated talkers. It could therefore be used for voice dialing, but in most cases push-button dialing was cheaper and faster, rather than speaking the consecutive digits. Another early tool which was enabled to perform digital speech recognition was the IBM Shoebox voice-activated calculator, presented to the general public during the 1962 Seattle World's Fair after its initial market launch in 1961. This early computer, developed almost 20 years before the introduction of the first
IBM Personal Computer The IBM Personal Computer (model 5150, commonly known as the IBM PC) is the first microcomputer released in the IBM PC model line and the basis for the IBM PC compatible de facto standard. Released on August 12, 1981, it was created by a team ...
in 1981, was able to recognize 16 spoken words and the digits 0 to 9. The first natural language processing computer program or the
chatbot A chatbot or chatterbot is a software application used to conduct an on-line chat conversation via text or text-to-speech, in lieu of providing direct contact with a live human agent. Designed to convincingly simulate the way a human would behav ...
ELIZA ELIZA is an early natural language processing computer program created from 1964 to 1966 at the MIT Artificial Intelligence Laboratory by Joseph Weizenbaum. Created to demonstrate the superficiality of communication between humans and machines, ...
was developed by MIT professor
Joseph Weizenbaum Joseph Weizenbaum (8 January 1923 – 5 March 2008) was a German American computer scientist and a professor at MIT. The Weizenbaum Award is named after him. He is considered one of the fathers of modern artificial intelligence. Life and caree ...
in the 1960s. It was created to "demonstrate that the communication between man and machine was superficial". ELIZA used pattern matching and substitution methodology into scripted responses to simulate conversation, which gave an illusion of understanding on the part of the program. Weizenbaum's own secretary reportedly asked Weizenbaum to leave the room so that she and ELIZA could have a real conversation. Weizenbaum was surprised by this, later writing: "I had not realized ... that extremely short exposures to a relatively simple computer program could induce powerful delusional thinking in quite normal people. This gave name to the
ELIZA effect The ELIZA effect, in computer science, is the tendency to unconsciously assume computer behaviors are analogous to human behaviors; that is, anthropomorphisation. Overview In its specific form, the ELIZA effect refers only to "the susceptibility ...
, the tendency to unconsciously assume computer behaviors are analogous to human behaviors; that is, anthropomorphisation, a phenomenon present in human interactions with virtual assistants. The next milestone in the development of voice recognition technology was achieved in the 1970s at the Carnegie Mellon University in
Pittsburgh Pittsburgh ( ) is a city in the Commonwealth of Pennsylvania, United States, and the county seat of Allegheny County. It is the most populous city in both Allegheny County and Western Pennsylvania, the second-most populous city in Pennsylva ...
, Pennsylvania with substantial support of the
United States Department of Defense The United States Department of Defense (DoD, USDOD or DOD) is an executive branch department of the federal government charged with coordinating and supervising all agencies and functions of the government directly related to national sec ...
and its
DARPA The Defense Advanced Research Projects Agency (DARPA) is a research and development agency of the United States Department of Defense responsible for the development of emerging technologies for use by the military. Originally known as the Ad ...
agency, funded five years of a Speech Understanding Research program, aiming to reach a minimum vocabulary of 1,000 words. Companies and academia including IBM, Carnegie Mellon University (CMU) and Stanford Research Institute took part in the program. The result was "Harpy", it mastered about 1000 words, the vocabulary of a three-year-old and it could understand sentences. It could process speech that followed pre-programmed vocabulary, pronunciation, and grammar structures to determine which sequences of words made sense together, and thus reducing speech recognition errors. In 1986 Tangora was an upgrade of the Shoebox, it was a voice recognizing typewriter. Named after the world's fastest typist at the time, it had a vocabulary of 20,000 words and used prediction to decide the most likely result based on what was said in the past. IBM's approach was based on a
hidden Markov model A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process — call it X — with unobservable ("''hidden''") states. As part of the definition, HMM requires that there be an o ...
, which adds statistics to digital signal processing techniques. The method makes it possible to predict the most likely
phoneme In phonology and linguistics, a phoneme () is a unit of sound that can distinguish one word from another in a particular language. For example, in most dialects of English, with the notable exception of the West Midlands and the north-wes ...
s to follow a given phoneme. Still each speaker had to individually train the typewriter to recognize his or her voice, and pause between each word.


Birth of smart virtual assistants: 1990s–present

In the 1990s, digital speech recognition technology became a feature of the personal computer with IBM,
Philips Koninklijke Philips N.V. (), commonly shortened to Philips, is a Dutch multinational conglomerate corporation that was founded in Eindhoven in 1891. Since 1997, it has been mostly headquartered in Amsterdam, though the Benelux headquarters i ...
and
Lernout & Hauspie Lernout & Hauspie Speech Products, or L&H, was a Belgium-based speech recognition technology company, founded by Jo Lernout and Pol Hauspie, that went bankrupt in 2001 because of a fraud engineered by the management. The company was based in Ypr ...
fighting for customers. Much later the market launch of the first
smartphone A smartphone is a portable computer device that combines mobile telephone and computing functions into one unit. They are distinguished from feature phones by their stronger hardware capabilities and extensive mobile operating systems, whic ...
IBM Simon The IBM Simon Personal Communicator (simply known as IBM Simon) is a handheld, touchscreen PDA designed by International Business Machines (IBM), and manufactured by Mitsubishi Electric. Although the term "smartphone" was not coined until 199 ...
in 1994 laid the foundation for smart virtual assistants as we know them today. In 1997, Dragon's Naturally Speaking software could recognize and transcribe natural human speech without pauses between each word into a document at a rate of 100 words per minute. A version of Naturally Speaking is still available for download and it is still used today, for instance, by many doctors in the US and the UK to document their medical records. In 2001
Colloquis Colloquis, previously known as ActiveBuddy and Conversagent, was a company that created conversation-based interactive agents originally distributed via instant messaging platforms. The company had offices in New York, NY and Sunnyvale, CA. Found ...
publicly launched
SmarterChild SmarterChild was a chatbot available on AOL Instant Messenger and Windows Live Messenger (previously MSN Messenger) networks. History SmarterChild was an intelligent agent or "bot" developed by ActiveBuddy, Inc., with offices in New York and Su ...
, on platforms like AIM and
MSN Messenger MSN Messenger (also known colloquially simply as "Messenger"), later rebranded as Windows Live Messenger, was a cross-platform instant-messaging client developed by Microsoft. It connected to the Microsoft Messenger service and, in later versio ...
. While entirely text-based SmarterChild was able to play games, check the weather, look up facts, and converse with users to an extent. The first modern digital virtual assistant installed on a smartphone was
Siri Siri ( ) is a virtual assistant that is part of Apple Inc.'s iOS, iPadOS, watchOS, macOS, tvOS, and audioOS operating systems. It uses voice queries, gesture based control, focus-tracking and a natural-language user interface to answer qu ...
, which was introduced as a feature of the
iPhone 4S The iPhone 4S (originally styled as iPhone 4 S, retroactively stylized with a lowercase 's' as iPhone 4s as of September 2013) is a smartphone that was designed and marketed by Apple Inc. It is the fifth generation of the iPhone, succ ...
on 4 October 2011.
Apple Inc. Apple Inc. is an American multinational technology company headquartered in Cupertino, California, United States. Apple is the largest technology company by revenue (totaling in 2021) and, as of June 2022, is the world's biggest company ...
developed Siri following the 2010 acquisition of Siri Inc., a spin-off of
SRI International SRI International (SRI) is an American nonprofit scientific research institute and organization headquartered in Menlo Park, California. The trustees of Stanford University established SRI in 1946 as a center of innovation to support economic ...
, which is a research institute financed by
DARPA The Defense Advanced Research Projects Agency (DARPA) is a research and development agency of the United States Department of Defense responsible for the development of emerging technologies for use by the military. Originally known as the Ad ...
and the
United States Department of Defense The United States Department of Defense (DoD, USDOD or DOD) is an executive branch department of the federal government charged with coordinating and supervising all agencies and functions of the government directly related to national sec ...
. Its aim was to aid in tasks such as sending a text message, making phone calls, checking the weather or setting up an alarm. Over time, it has developed to provide restaurant recommendations, search the internet, and provide driving directions. In November 2014, Amazon announced Alexa alongside the Echo. In April 2017 Amazon released a service for building conversational interfaces for any type of virtual assistant or interface.


Method of interaction

Virtual assistants work via: * Text, including:
online chat Online chat may refer to any kind of communication over the Internet that offers a real-time transmission of text messages from sender to receiver. Chat messages are generally short in order to enable other participants to respond quickly. Ther ...
(especially in an
instant messaging Instant messaging (IM) technology is a type of online chat allowing real-time text transmission over the Internet or another computer network. Messages are typically transmitted between two or more parties, when each user inputs text and trigge ...
application or other application ), SMS text,
e-mail Electronic mail (email or e-mail) is a method of exchanging messages ("mail") between people using electronic devices. Email was thus conceived as the electronic (digital) version of, or counterpart to, mail, at a time when "mail" meant ...
or other text-based communication channel, for example Conversica's intelligent virtual assistants for business. * Voice, for example with
Amazon Alexa Amazon Alexa, also known simply as Alexa, is a virtual assistant technology largely based on a Polish speech synthesiser named Ivona, bought by Amazon in 2013. It was first used in the Amazon Echo smart speaker and the Echo Dot, Echo Studio ...
on the
Amazon Echo Amazon Echo, often shortened to Echo, is an American brand of smart speakers developed by Amazon. Echo devices connect to the voice-controlled intelligent personal assistant service '' Alexa'', which will respond when a user says "Alexa". Users ...
device,
Siri Siri ( ) is a virtual assistant that is part of Apple Inc.'s iOS, iPadOS, watchOS, macOS, tvOS, and audioOS operating systems. It uses voice queries, gesture based control, focus-tracking and a natural-language user interface to answer qu ...
on an iPhone, or
Google Assistant Google Assistant is a virtual assistant software application developed by Google that is primarily available on mobile and home automation devices. Based on artificial intelligence, Google Assistant can engage in two-way conversations, unlike t ...
on Google-enabled/ Android mobile devices * By taking and/or uploading images, as in the case of
Samsung The Samsung Group (or simply Samsung) ( ko, 삼성 ) is a South Korean multinational manufacturing conglomerate headquartered in Samsung Town, Seoul, South Korea. It comprises numerous affiliated businesses, most of them united under the ...
Bixby on the
Samsung Galaxy S8 The Samsung Galaxy S8 and Samsung Galaxy S8+ are Android smartphones produced by Samsung Electronics as the eighth generation of the Samsung Galaxy S series. The S8 and S8+ were unveiled on 29 March 2017 and directly succeeded the Samsung Gal ...
Some virtual assistants are accessible via multiple methods, such as
Google Assistant Google Assistant is a virtual assistant software application developed by Google that is primarily available on mobile and home automation devices. Based on artificial intelligence, Google Assistant can engage in two-way conversations, unlike t ...
via chat on the
Google Allo Google Allo was an instant messaging mobile app by Google for the Android and iOS mobile operating systems, with a web client available on Google Chrome, Mozilla Firefox, and Opera. It closed on March 12, 2019. The app used phone numbers as ...
and Google Messages app and via voice on
Google Home Google Nest, previously named Google Home, is a line of smart speakers developed by Google under the Google Nest brand. The devices enable users to speak voice commands to interact with services through Google Assistant, the company's virtual ...
smart speaker A smart speaker is a type of loudspeaker and voice command device with an integrated virtual assistant that offers interactive actions and hands-free activation with the help of one "hot word" (or several "hot words"). Some smart speakers can a ...
s. Virtual assistants use natural language processing (NLP) to match user text or voice input to executable commands. Many continually learn using
artificial intelligence Artificial intelligence (AI) is intelligence—perceiving, synthesizing, and inferring information—demonstrated by machines, as opposed to intelligence displayed by animals and humans. Example tasks in which this is done include speech r ...
techniques including
machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...
and
ambient intelligence In computing, ambient intelligence (AmI) refers to electronic environments that are sensitive and responsive to the presence of people. Ambient intelligence was a projection on the future of consumer electronics, telecommunications and comput ...
. Some of these assistants like Google Assistant (which contains
Google Lens Google Lens is an image recognition technology developed by Google, designed to bring up relevant information related to objects it identifies using visual analysis based on a neural network. First announced during Google I/O 2017, it was firs ...
) and Samsung Bixby also have the added ability to do image processing to recognize objects in the image to help the users get better results from the clicked images. To activate a virtual assistant using the voice, a wake word might be used. This is a word or groups of words such as "Hey Siri", "OK Google" or "Hey Google", "Alexa", and "Hey Microsoft". As virtual assistants become more popular, there are increasing legal risks involved.


Devices and objects where found

Virtual assistants may be integrated into many types of platforms or, like
Amazon Alexa Amazon Alexa, also known simply as Alexa, is a virtual assistant technology largely based on a Polish speech synthesiser named Ivona, bought by Amazon in 2013. It was first used in the Amazon Echo smart speaker and the Echo Dot, Echo Studio ...
, across several of them: * Into devices like
smart speakers A smart speaker is a type of loudspeaker and voice command device with an integrated virtual assistant that offers interactive actions and hands-free activation with the help of one "hot word" (or several "hot words"). Some smart speakers can a ...
such as
Amazon Echo Amazon Echo, often shortened to Echo, is an American brand of smart speakers developed by Amazon. Echo devices connect to the voice-controlled intelligent personal assistant service '' Alexa'', which will respond when a user says "Alexa". Users ...
,
Google Home Google Nest, previously named Google Home, is a line of smart speakers developed by Google under the Google Nest brand. The devices enable users to speak voice commands to interact with services through Google Assistant, the company's virtual ...
and Apple HomePod * In
instant messaging Instant messaging (IM) technology is a type of online chat allowing real-time text transmission over the Internet or another computer network. Messages are typically transmitted between two or more parties, when each user inputs text and trigge ...
applications on both smartphones and via the Web, e.g.
Facebook Facebook is an online social media and social networking service owned by American company Meta Platforms. Founded in 2004 by Mark Zuckerberg with fellow Harvard College students and roommates Eduardo Saverin, Andrew McCollum, Dustin Mosk ...
's M (virtual assistant) on both
Facebook Facebook is an online social media and social networking service owned by American company Meta Platforms. Founded in 2004 by Mark Zuckerberg with fellow Harvard College students and roommates Eduardo Saverin, Andrew McCollum, Dustin Mosk ...
and
Facebook Messenger Messenger is a proprietary instant messaging app and platform developed by Meta Platforms. Originally developed as Facebook Chat in 2008, the company revamped its messaging service in 2010, released standalone iOS and Android apps in 2011, and ...
apps or via the Web * Built into a
mobile operating system A mobile operating system is an operating system for mobile phones, tablets, smartwatches, smartglasses, or other non-laptop personal mobile computing devices. While computers such as typical laptops are "mobile", the operating systems used on ...
(OS), as are
Apple An apple is an edible fruit produced by an apple tree (''Malus domestica''). Apple trees are cultivated worldwide and are the most widely grown species in the genus ''Malus''. The tree originated in Central Asia, where its wild ancestor, ' ...
's
Siri Siri ( ) is a virtual assistant that is part of Apple Inc.'s iOS, iPadOS, watchOS, macOS, tvOS, and audioOS operating systems. It uses voice queries, gesture based control, focus-tracking and a natural-language user interface to answer qu ...
on
iOS iOS (formerly iPhone OS) is a mobile operating system created and developed by Apple Inc. exclusively for its hardware. It is the operating system that powers many of the company's mobile devices, including the iPhone; the term also include ...
devices and BlackBerry Assistant on
BlackBerry 10 BlackBerry 10 is a discontinued proprietary mobile operating system for the BlackBerry line of smartphones, both developed by BlackBerry Limited (formerly Research In Motion). BlackBerry 10 is based on QNX, a Unix-like operating system that was ...
devices, or into a desktop OS such as Cortana on Microsoft Windows OS * Built into a smartphone independent of the OS, as is Bixby on the
Samsung Galaxy S8 The Samsung Galaxy S8 and Samsung Galaxy S8+ are Android smartphones produced by Samsung Electronics as the eighth generation of the Samsung Galaxy S series. The S8 and S8+ were unveiled on 29 March 2017 and directly succeeded the Samsung Gal ...
and Note 8. * Within instant messaging platforms, assistants from specific organizations, such as Aeromexico's Aerobot on Facebook Messenger or Wechat Secretary on
WeChat WeChat () is a Chinese instant messaging, social media, and mobile payment app developed by Tencent. First released in 2011, it became the world's largest standalone mobile app in 2018, with over 1 billion monthly active users. WeChat has b ...
* Within mobile apps from specific companies and other organizations, such as Dom from
Domino's Pizza Domino's Pizza, Inc., trading as Domino's, is an American multinational pizza restaurant chain founded in 1960 and led by CEO Russell Weiner. The corporation is Delaware domiciled and headquartered at the Domino's Farms Office Park in Ann Arbor ...
* In appliances, cars, and
wearable technology Wearable technology is any technology that is designed to be used while worn. Common types of wearable technology include smartwatches and smartglasses. Wearable electronic devices are often close to or on the surface of the skin, where they detec ...
. * Previous generations of virtual assistants often worked on websites, such as
Alaska Airlines Alaska Airlines is a major American airline headquartered in SeaTac, Washington, within the Seattle metropolitan area. It is the sixth largest airline in North America when measured by fleet size, scheduled passengers carried, and the num ...
' Ask Jenn, or on
interactive voice response Interactive voice response (IVR) is a technology that allows telephone users to interact with a computer-operated telephone system through the use of voice and DTMF tones input with a keypad. In telecommunications, IVR allows customers to interac ...
(IVR) systems such as
American Airlines American Airlines is a major airlines of the United States, major US-based airline headquartered in Fort Worth, Texas, within the Dallas–Fort Worth metroplex. It is the Largest airlines in the world, largest airline in the world when measured ...
' IVR by Nuance.


Services

Virtual assistants can provide a wide variety of services. These include: * Provide information such as weather, facts from e.g.
Wikipedia Wikipedia is a multilingual free online encyclopedia written and maintained by a community of volunteers, known as Wikipedians, through open collaboration and using a wiki-based editing system. Wikipedia is the largest and most-read refer ...
or
IMDb IMDb (an abbreviation of Internet Movie Database) is an online database of information related to films, television series, home videos, video games, and streaming content online – including cast, production crew and personal biographies, ...
, set an alarm, make to-do lists and shopping lists * Play music from streaming services such as
Spotify Spotify (; ) is a proprietary Swedish audio streaming and media services provider founded on 23 April 2006 by Daniel Ek and Martin Lorentzon. It is one of the largest music streaming service providers, with over 456 million monthly active us ...
and Pandora; play radio stations; read
audiobooks An audiobook (or a talking book) is a recording of a book or other work being read out loud. A reading of the complete text is described as "unabridged", while readings of shorter versions are abridgements. Spoken audio has been available in sc ...
* Play videos, TV shows or movies on televisions, streaming from e.g.
Netflix Netflix, Inc. is an American subscription video on-demand over-the-top streaming service and production company based in Los Gatos, California. Founded in 1997 by Reed Hastings and Marc Randolph in Scotts Valley, California, it offers a fi ...
*
Conversational commerce Conversational commerce is e-commerce done via various means of conversation ( live support on e-commerce Web sites, online chat using messaging apps, chatbots on messaging apps or websites, voice assistants) and using technology such as: speech re ...
(see below) *Assist public interactions with government (see Artificial intelligence in government) * Complement and/or replace customer service by humans. One report estimated that an automated online assistant produced a 30% decrease in the work-load for a human-provided
call centre A call centre ( Commonwealth spelling) or call center ( American spelling; see spelling differences) is a managed capability that can be centralised or remote that is used for receiving or transmitting a large volume of enquiries by telephon ...
. * In-Car Voice Assistants Interaction with virtual assistants like Siri and Alexa has become common today. It allows people to get more comfortable and have an exceptional driving experience.


Conversational commerce

Conversational commerce is
e-commerce E-commerce (electronic commerce) is the activity of electronically buying or selling of products on online services or over the Internet. E-commerce draws on technologies such as mobile commerce, electronic funds transfer, supply chain managem ...
via various means of messaging, including via voice assistants but also
live chat A web chat is a system that allows users to communicate in real-time using easily accessible web interfaces. It is a type of Internet online chat distinguished by its simplicity and accessibility to users who do not wish to take the time to insta ...
on e-commerce
Web sites A website (also written as a web site) is a collection of web pages and related content that is identified by a common domain name and published on at least one web server. Examples of notable websites are Google, Facebook, Amazon, and Wikip ...
, live chat on messaging applications such as
WeChat WeChat () is a Chinese instant messaging, social media, and mobile payment app developed by Tencent. First released in 2011, it became the world's largest standalone mobile app in 2018, with over 1 billion monthly active users. WeChat has b ...
,
Facebook Messenger Messenger is a proprietary instant messaging app and platform developed by Meta Platforms. Originally developed as Facebook Chat in 2008, the company revamped its messaging service in 2010, released standalone iOS and Android apps in 2011, and ...
and
WhatsApp WhatsApp (also called WhatsApp Messenger) is an internationally available freeware, cross-platform, centralized instant messaging (IM) and voice-over-IP (VoIP) service owned by American company Meta Platforms (formerly Facebook). It allows use ...
and
chatbots A chatbot or chatterbot is a software application used to conduct an on-line chat conversation via text or text-to-speech, in lieu of providing direct contact with a live human agent. Designed to convincingly simulate the way a human would behav ...
on messaging applications or Web sites.


Customer support

A virtual assistant can work with customer support team of a business to provide 24x7 support to customers. It provides quick responses, which enhances a customer's experience.


Third-party services

Amazon enables Alexa "Skills" and Google "Actions", essentially applications that run on the assistant platforms.


Virtual assistant privacy

Virtual assistants have a variety of privacy concerns associated with them. Features such as activation by voice pose a threat, as such features requires the device to always be listening. Modes of privacy such as the virtual security button have been proposed to create a multilayer authentication for virtual assistants.


Privacy policy of prominent virtual assistants


Google Assistant

The privacy policy of Google Assistant states that it does not store the audio data without the user's permission, but may store the conversation transcripts to personalise its experience. Personalisation can be turned off in settings. If a user wants Google Assistant to store audio data, they can go to Voice & Audio Activity (VAA) and turn on this feature. Audio files are sent to the cloud and used by Google to improve the performance of Google Assistant, but only if the VAA feature is turned on.


Amazon's Alexa

The privacy policy of Amazon's virtual assistant,
Alexa Alexa may refer to: Technology *Amazon Alexa, a virtual assistant developed by Amazon * Alexa Internet, a defunct website ranking and traffic analysis service * Arri Alexa, a digital motion picture camera People * Alexa (name), a given name a ...
, states that it only listens to conversations when its wake word (like Alexa, Amazon, Echo) is used. It starts recording the conversation after the call of a wake word, and stops recording after 8 seconds of silence. It sends the recorded conversation to the cloud. It is possible to delete the recording from the cloud by visiting ‘Alexa Privacy’ in ‘Alexa’.


Apple's Siri

Apple states that it does not record audio to improve Siri. Instead, it uses transcripts. Transcript data is only sent if it is deemed important for analysis. Users can opt out anytime if they don't want Siri to send the transcripts in the cloud.


Presumed and observed interest for the consumer


Presumed added value as allowing a new way of interactions

Added value of the virtual assistants can come among others from the following: * Voice communication can sometimes represent the optimal man-machine communication: # It is convenient: there are some sectors where voice is the only way of possible communication, and more generally, it allows to free-up both hands and vision potentially for doing another activity in parallel, or helps also disabled people. # It is faster: Voice is more efficient than writing on a keyboard: we can speak up to 200
words per minute Words per minute, commonly abbreviated wpm (sometimes uppercased WPM), is a measure of words processed in a minute, often used as a measurement of the speed of typing, reading or Morse code sending and receiving. Alphanumeric entry Since word ...
opposed to 60 in case of writing on a keyboard. It is also more natural thus requiring less effort (reading a text however can reach 700 words per minute). * Virtual assistants save a lot of time by automation: they can take appointments, or read the news while the consumer does something else. It is also possible to ask the virtual assistant to schedule meetings, hence helping to organize time. The designers of new digital schedulers explained the ambition they had that these calendars schedule lives to make the consumer use his time more efficiently, through machine learning processes, and complete organization of work time and free time. As an example when the consumer expresses the desire of scheduling a break, the VA will schedule it at an optimal moment for this purpose (for example at a time of the week where they are less productive), with the additional long-term objective of being able to schedule and organize the free time of the consumer, to assure them optimal work efficiency.


Perceived interest

*According to a recent study (2019), the two reasons for using virtual assistants for consumers are perceived usefulness and perceived enjoyment. The first result of this study is that both perceived usefulness and perceived enjoyment have an equivalent very strong influence for the consumer willingness to use a virtual assistant. * The second result of this study is that: # Provided content quality has a very strong influence on perceived usefulness and a strong influence on perceived enjoyment. # Visual attractiveness has a very strong influence on perceived enjoyment. # Automation has a strong influence on perceived usefulness.


Controversies


Artificial intelligence controversies

* Virtual assistants spur the
filter bubble A filter bubble or ideological frame is a state of intellectual isolationTechnopediaDefinition – What does Filter Bubble mean?, Retrieved October 10, 2017, "....A filter bubble is the intellectual isolation, that can occur when websites make us ...
: As for
social media Social media are interactive media technologies that facilitate the creation and sharing of information, ideas, interests, and other forms of expression through virtual communities and networks. While challenges to the definition of ''social medi ...
, virtual assistants’ algorithms are trained to show pertinent data and discard others based on previous activities of the consumer: The pertinent data is the one which will interest or please the consumer. As a result, they become isolated from data that disagrees with their viewpoints, effectively isolating them into their own intellectual bubble, and reinforcing their opinions. This phenomenon was known to reinforce
fake news Fake news is false or misleading information presented as news. Fake news often has the aim of damaging the reputation of a person or entity, or making money through advertising revenue.Schlesinger, Robert (April 14, 2017)"Fake news in reality ...
and echo chambers. * Virtual assistants are also sometimes criticized for being overrated. In particular, A. Casilli points out that the AI of virtual assistants are neither intelligent nor artificial for two reasons: # Not intelligent because all they do is being the assistant of the human, and only by doing tasks that a human could do easily, and in a very limited specter of actions: find, class, and present information, offers or documents. Also, virtual assistants are neither able to make decisions on their own nor to anticipate things. # And not artificial because they would be impossible without human labelization through micro working.


Ethics implications

In 2019 Antonio A. Casilli, a French sociologist, criticized artificial intelligence and virtual assistants in particular in the following way: At a first level the fact that the consumer provides free data for the training and improvement of the virtual assistant, often without knowing it, is ethically disturbing. But at a second level, it might be even more ethically disturbing to know how these
AIs AIS may refer to: Medicine * Abbreviated Injury Scale, an anatomical-based coding system to classify and describe the severity of injuries * Acute ischemic stroke, the thromboembolic type of stroke * Androgen insensitivity syndrome, an intersex ...
are trained with this data. This
artificial intelligence Artificial intelligence (AI) is intelligence—perceiving, synthesizing, and inferring information—demonstrated by machines, as opposed to intelligence displayed by animals and humans. Example tasks in which this is done include speech r ...
is trained via neural networks, which require a huge amount of labelled data. However, this data needs to be labelled through a human process, which explains the rise of microwork in the last decade. That is, remotely using some people worldwide doing some repetitive and very simple tasks for a few cents, such as listening to virtual assistant speech data, and writing down what was said. Microwork has been criticized for the job insecurity it causes, and for the total lack of regulation: The average salary was 1,38 dollar/hour in 2010, and it provides neither healthcare nor retirement benefits, sick pay, minimum wage. Hence, virtual assistants and their designers are controversial for spurring job insecurity, and the AIs they propose are still human in the way that they would be impossible without the microwork of millions of human workers. Privacy concerns are raised by the fact that voice commands are available to the providers of virtual assistants in unencrypted form, and can thus be shared with third parties and be processed in an unauthorized or unexpected manner. Additionally to the linguistic content of recorded speech, a user's manner of expression and voice characteristics can implicitly contain information about his or her biometric identity, personality traits, body shape, physical and mental health condition, sex, gender, moods and emotions, socioeconomic status and geographical origin.


Developer platforms

Notable developer platforms for virtual assistants include: *
Amazon Lex Amazon Lex is a service for building conversational interfaces into any application using voice and text. It powers the Amazon Alexa virtual assistant. In April 2017, the platform was released to the developer community, and suggested that it cou ...
was opened to developers in April 2017. It involves
natural language understanding Natural-language understanding (NLU) or natural-language interpretation (NLI) is a subtopic of natural-language processing in artificial intelligence that deals with machine reading comprehension. Natural-language understanding is considered an A ...
technology combined with automatic speech recognition and had been introduced in November 2016. * Google provides the Actions on Google and
Dialogflow Dialogflow is a natural language understanding platform used to design and integrate a conversational user interface into mobile apps, web applications, devices, bots, interactive voice response systems and related uses. History In May 2012, Sp ...
platforms for developers to create "Actions" for Google Assistant * Apple provides SiriKit for developers to create extensions for
Siri Siri ( ) is a virtual assistant that is part of Apple Inc.'s iOS, iPadOS, watchOS, macOS, tvOS, and audioOS operating systems. It uses voice queries, gesture based control, focus-tracking and a natural-language user interface to answer qu ...
* IBM's Watson, while sometimes spoken of as a virtual assistant is in fact an entire
artificial intelligence Artificial intelligence (AI) is intelligence—perceiving, synthesizing, and inferring information—demonstrated by machines, as opposed to intelligence displayed by animals and humans. Example tasks in which this is done include speech r ...
platform and community powering some virtual assistants,
chatbots A chatbot or chatterbot is a software application used to conduct an on-line chat conversation via text or text-to-speech, in lieu of providing direct contact with a live human agent. Designed to convincingly simulate the way a human would behav ...
. and many other types of solutions.


Previous generations

In previous generations of text chat-based virtual assistants, the assistant was often represented by an
avatar Avatar (, ; ), is a concept within Hinduism that in Sanskrit literally means "descent". It signifies the material appearance or incarnation of a powerful deity, goddess or spirit on Earth. The relative verb to "alight, to make one's appeara ...
(a.k.a. ''interactive online character'' or ''automated character'') — this was known as an
embodied agent In artificial intelligence, an embodied agent, also sometimes referred to as an interface agent, is an intelligent agent that interacts with the environment through a physical body within that environment. Agents that are represented graphically ...
.


Economic relevance


For individuals

Digital experiences enabled by virtual assistants are considered to be among the major recent technological advances and most promising consumer trends. Experts claim that digital experiences will achieve a status-weight comparable to ‘real’ experiences, if not become more sought-after and prized. The trend is verified by a high number of frequent users and the substantial growth of worldwide user numbers of virtual digital assistants. In mid-2017, the number of frequent users of digital virtual assistants is estimated to be around 1 bn worldwide. In addition, it can be observed that virtual digital assistant technology is no longer restricted to smartphone applications, but present across many industry sectors (incl. automotive, telecommunications,
retail Retail is the sale of goods and services to consumers, in contrast to wholesaling, which is sale to business or institutional customers. A retailer purchases goods in large quantities from manufacturers, directly or through a wholesaler, and ...
, healthcare and education). In response to the significant R&D expenses of firms across all sectors and an increasing implementation of mobile devices, the market for speech recognition technology is predicted to grow at a
CAGR Compound annual growth rate (CAGR) is a business and investing specific term for the geometric progression ratio that provides a constant rate of return over the time period. CAGR is not an accounting term, but it is often used to describe some ele ...
of 34.9% globally over the period of 2016 to 2024 and thereby surpass a global market size of US$7.5 billion by 2024. According to an Ovum study, the "native digital assistant installed base" is projected to exceed the world's population by 2021, with 7.5 billion active voice AI–capable devices. According to Ovum, by that time "Google Assistant will dominate the voice AI–capable device market with 23.3% market share, followed by Samsung's Bixby (14.5%), Apple's Siri (13.1%), Amazon's Alexa (3.9%), and Microsoft's Cortana (2.3%)." Taking into consideration the regional distribution of market leaders, North American companies (e.g.
Nuance Communications Nuance Communications, Inc. is an American multinational computer software technology corporation, headquartered in Burlington, Massachusetts, that markets speech recognition and artificial intelligence software. Nuance merged with its compe ...
, IBM, eGain) are expected to dominate the industry over the next years, due to the significant impact of BYOD (
Bring Your Own Device Bring your own device (BYOD )—also called bring your own technology (BYOT), bring your own phone (BYOP), and bring your own personal computer (BYOPC)—refers to being allowed to use one's personally owned device, rather than being required to u ...
) and enterprise mobility business models. Furthermore, the increasing demand for smartphone-assisted platforms are expected to further boost the North American intelligent virtual assistant (IVA) industry growth. Despite its smaller size in comparison to the North American market, the intelligent virtual assistant industry from the Asia-Pacific region, with its main players located in India and China is predicted to grow at an annual growth rate of 40% (above global average) over the 2016–2024 period.


Economic opportunity for enterprises

Virtual assistants should not be only seen as a gadget for individuals, as they could have a real economic utility for enterprises. As an example, a virtual assistant can take the role of an always available assistant with an encyclopedic knowledge. And which can organize meetings, check inventories, verify informations. Virtual assistants are all the more important that their integration in small and middle-sized enterprises often consists in an easy first step through the more global adaptation and use of Internet of Things (IoT). Indeed, IoT technologies are first perceived by small and medium-sized enterprises as technologies of critical importance, but too complicated, risky or costly to be used.


Security

In May 2018, researchers from the
University of California, Berkeley The University of California, Berkeley (UC Berkeley, Berkeley, Cal, or California) is a public land-grant research university in Berkeley, California. Established in 1868 as the University of California, it is the state's first land-grant u ...
, published a paper that showed audio commands undetectable for the human ear could be directly embedded into music or spoken text, thereby manipulating virtual assistants into performing certain actions without the user taking note of it. The researchers made small changes to audio files, which cancelled out the sound patterns that speech recognition systems are meant to detect. These were replaced with sounds that would be interpreted differently by the system and command it to dial phone numbers, open websites or even transfer money. The possibility of this has been known since 2016, and affects devices from
Apple An apple is an edible fruit produced by an apple tree (''Malus domestica''). Apple trees are cultivated worldwide and are the most widely grown species in the genus ''Malus''. The tree originated in Central Asia, where its wild ancestor, ' ...
,
Amazon Amazon most often refers to: * Amazons, a tribe of female warriors in Greek mythology * Amazon rainforest, a rainforest covering most of the Amazon basin * Amazon River, in South America * Amazon (company), an American multinational technolog ...
and
Google Google LLC () is an American Multinational corporation, multinational technology company focusing on Search Engine, search engine technology, online advertising, cloud computing, software, computer software, quantum computing, e-commerce, ar ...
. In addition to unintentional actions and voice recording, another security and privacy risk associated with intelligent virtual assistants is malicious voice commands: An attacker who impersonates a user and issues malicious voice commands to, for example, unlock a smart door to gain unauthorized entry to a home or garage or order items online without the user's knowledge. Although some IVAs provide a voice-training feature to prevent such impersonation, it can be difficult for the system to distinguish between similar voices. Thus, a malicious person who is able to access an IVA-enabled device might be able to fool the system into thinking that they are the real owner and carry out criminal or mischievous acts.


Comparison of notable assistants


See also

*
Ambient intelligence In computing, ambient intelligence (AmI) refers to electronic environments that are sensitive and responsive to the presence of people. Ambient intelligence was a projection on the future of consumer electronics, telecommunications and comput ...
*
Applications of artificial intelligence Artificial intelligence (AI) has been used in applications to alleviate certain problems throughout industry and academia. AI, like electricity or computers, is a general purpose technology that has a multitude of applications. It has been used ...
*
Autonomous agent An autonomous agent is an intelligent agent operating on a user's behalf but without any interference of that user. An intelligent agent, however appears according to an IBM white paper as: Intelligent agents are software entities that carry out ...
*
Chatbot A chatbot or chatterbot is a software application used to conduct an on-line chat conversation via text or text-to-speech, in lieu of providing direct contact with a live human agent. Designed to convincingly simulate the way a human would behav ...
*
ChatGPT ChatGPT (Generative Pre-trained Transformer) is a chatbot launched by OpenAI in November 2022. It is built on top of OpenAI's GPT-3 family of large language models, and is fine-tuned (an approach to transfer learning) with both supervised and ...
* Conversational user interface *
Computer facial animation Computer facial animation is primarily an area of computer graphics that encapsulates methods and techniques for generating and animating images or models of a character face. The character can be a human, a humanoid, an animal, a legendary creatu ...
*
Embodied agent In artificial intelligence, an embodied agent, also sometimes referred to as an interface agent, is an intelligent agent that interacts with the environment through a physical body within that environment. Agents that are represented graphically ...
* Expert system *
Friendly artificial intelligence Friendly artificial intelligence (also friendly AI or FAI) refers to hypothetical artificial general intelligence (AGI) that would have a positive (benign) effect on humanity or at least align with human interests or contribute to foster the impro ...
* Home network *
Hybrid intelligent system Hybrid intelligent system denotes a software system which employs, in parallel, a combination of methods and techniques from artificial intelligence subfields, such as: * Neuro-symbolic systems * Neuro-fuzzy systems * Hybrid connectionist-symbolic ...
*
Intelligent agent In artificial intelligence, an intelligent agent (IA) is anything which perceives its environment, takes actions autonomously in order to achieve goals, and may improve its performance with learning or may use knowledge. They may be simple or ...
*
Interactions Corporation Interactions LLC is a privately held technology company that builds and delivers hosted Virtual Assistant applications that enable businesses to deliver automated natural language communications for enterprise customer care. History Interaction ...
*
Knowledge Navigator The Knowledge Navigator is a concept described by former Apple Computer CEO John Sculley in his 1987 book, ''Odyssey: Pepsi to Apple''. It describes a device that can access a large networked database of hypertext information, and use software ...
* Microsoft Office Assistant *
Mobile operating system A mobile operating system is an operating system for mobile phones, tablets, smartwatches, smartglasses, or other non-laptop personal mobile computing devices. While computers such as typical laptops are "mobile", the operating systems used on ...
*
Multi-agent system A multi-agent system (MAS or "self-organized system") is a computerized system composed of multiple interacting intelligent agents.Hu, J.; Bhowmick, P.; Jang, I.; Arvin, F.; Lanzon, A.,A Decentralized Cluster Formation Containment Framework f ...
* Natural language processing * Simulated reality *
Social bot A social bot, or also described as a social AI or social algorithm, is a software agent that communicates autonomously on social media. The messages (e.g. tweets) it distributes can be simple and operate in groups and various configurations wit ...
* Software bot *
Software agent In computer science, a software agent or software AI is a computer program that acts for a user or other program in a relationship of agency, which derives from the Latin ''agere'' (to do): an agreement to act on one's behalf. Such "action on beha ...
*
Wizard (software) A software wizard or setup assistant is a user interface that presents a dialog box to lead the user through a sequence of small steps. It's often used to configure a program or service for the first time. Complex, rare, or unfamiliar tasks may be ...


References

{{Authority control * Applications of artificial intelligence Agent-based software Customer service Chatbots