A virtual assistant (VA) is a
software agent
that can perform a range of tasks or services for a user based on user input such as commands or questions, including verbal ones. Such technologies often incorporate
chatbot
capabilities to streamline task execution. The interaction may be via text, graphical interface, or voice - as some virtual assistants are able to interpret human speech and respond via synthesized voices.
In many cases, users can ask their virtual assistants questions, control home automation devices and media playback, and manage other basic tasks such as email, to-do lists, and calendars - all with verbal commands. In recent years, prominent virtual assistants for direct consumer use have included
Apple
's
Siri
,
Amazon Alexa,
Google Assistant, and
Samsung
's
Bixby. Also, companies in various industries often incorporate some kind of virtual assistant technology into their customer service or support.
Into the 2020s, the emergence of
artificial intelligence
based
chatbot
s, such as
ChatGPT, has brought increased capability and interest to the field of virtual assistant products and services.
History
Experimental decades: 1910s–1980s
Radio Rex was the first voice-activated toy, patented in 1916 and released in 1922. It was a wooden toy in the shape of a dog that would come out of its house when its name was called.
In 1952,
Bell Labs
presented "Audrey", the Automatic Digit Recognition machine. It occupied a six-foot-high relay rack, consumed substantial power, had streams of cables and exhibited the myriad maintenance problems associated with complex vacuum-tube circuitry. It could recognize the fundamental units of speech, phonemes, but was limited to accurate recognition of digits spoken by designated talkers. It could therefore be used for voice dialing, but in most cases push-button dialing was cheaper and faster than speaking the consecutive digits.
Another early tool which was enabled to perform digital speech recognition was the
IBM Shoebox voice-activated calculator, presented to the general public during the
1962 Seattle World's Fair after its initial market launch in 1961. This early computer, developed almost 20 years before the introduction of the first
IBM Personal Computer
in 1981, was able to recognize 16 spoken words and the digits 0 to 9.
The first
natural language processing
computer program, the chatbot
ELIZA, was developed by MIT professor
Joseph Weizenbaum
in the 1960s. It was created to "demonstrate that the communication between man and machine was superficial". ELIZA used pattern matching and substitution to turn user input into scripted responses that simulated conversation, which gave an illusion of understanding on the part of the program.
Weizenbaum's own secretary reportedly asked Weizenbaum to leave the room so that she and ELIZA could have a real conversation. Weizenbaum was surprised by this, later writing: "I had not realized ... that extremely short exposures to a relatively simple computer program could induce powerful delusional thinking in quite normal people."
This gave its name to the
ELIZA effect, the tendency to unconsciously assume computer behaviors are analogous to human behaviors; that is, anthropomorphisation, a phenomenon present in human interactions with virtual assistants.
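The pattern matching and substitution that ELIZA relied on can be illustrated with a minimal sketch. The rules, the pronoun table and the function names below are hypothetical and written for this illustration; they are not Weizenbaum's original script, only an example of the general technique.

```python
import re

# Hypothetical rules in the spirit of an ELIZA script (not the original one).
# Each rule pairs a pattern with a response template that reuses the capture.
RULES = [
    (re.compile(r"\bI need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (mother|father)\b", re.IGNORECASE), "Tell me more about your {0}."),
]

# Pronoun swaps applied to the captured fragment before it is echoed back.
REFLECTIONS = {"i": "you", "my": "your", "me": "you", "am": "are",
               "you": "I", "your": "my"}

DEFAULT_REPLY = "Please go on."


def reflect(fragment: str) -> str:
    """Swap first- and second-person words so the echo reads naturally."""
    words = fragment.rstrip(".!?").split()
    return " ".join(REFLECTIONS.get(word.lower(), word) for word in words)


def respond(utterance: str) -> str:
    """Return the scripted response of the first rule that matches."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(*(reflect(g) for g in match.groups()))
    return DEFAULT_REPLY
```

The program has no model of meaning: it only rearranges the user's own words, which is why the impression of understanding is an illusion.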
The next milestone in the development of voice recognition technology was achieved in the 1970s at the
Carnegie Mellon University
in
Pittsburgh
, Pennsylvania with substantial support of the
United States Department of Defense
and its
DARPA
agency, which funded five years of a Speech Understanding Research program aiming to reach a minimum vocabulary of 1,000 words. Companies and academic institutions including IBM, Carnegie Mellon University (CMU) and Stanford Research Institute took part in the program.
The result was "Harpy", which mastered about 1,000 words, the vocabulary of a three-year-old, and could understand sentences. It could process speech that followed pre-programmed vocabulary, pronunciation, and grammar structures to determine which sequences of words made sense together, thus reducing speech recognition errors.
In 1986 came Tangora, a voice-recognizing typewriter that was an upgrade of the Shoebox. Named after the world's fastest typist at the time, it had a vocabulary of 20,000 words and used prediction to decide the most likely result based on what was said in the past. IBM's approach was based on a
hidden Markov model, which adds statistics to digital signal processing techniques. The method makes it possible to predict the most likely
phoneme
s to follow a given phoneme. Still, each speaker had to individually train the typewriter to recognize his or her voice, and had to pause between words.
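The idea of predicting the most likely phoneme to follow a given one can be sketched with simple transition counts. This is only the transition part of a Markov model, estimated from a toy corpus; a full hidden Markov model would also relate the phonemes to the observed acoustic signal. The training data and function names are assumptions made for this illustration and do not describe IBM's implementation.

```python
from collections import Counter

# Toy phoneme transcriptions in ARPAbet-style symbols, chosen for illustration.
TRAINING = [
    ["K", "AE", "T"],       # "cat"
    ["K", "AE", "P"],       # "cap"
    ["K", "AE", "T", "S"],  # "cats"
    ["B", "AE", "T"],       # "bat"
]


def train(sequences):
    """Count how often each phoneme is followed by each other phoneme."""
    counts = {}
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            counts.setdefault(current, Counter())[following] += 1
    return counts


def next_probabilities(counts, phoneme):
    """Estimate P(next phoneme | current phoneme) from the counts."""
    followers = counts.get(phoneme, Counter())
    total = sum(followers.values())
    return {p: n / total for p, n in followers.items()} if total else {}


def most_likely_next(counts, phoneme):
    """Return the most probable follower, or None if the phoneme is unseen."""
    probs = next_probabilities(counts, phoneme)
    return max(probs, key=probs.get) if probs else None


MODEL = train(TRAINING)
```

With this toy corpus, "AE" is followed by "T" in three of its four occurrences, so "T" is the predicted follower; a recognizer can use such estimates to prefer likely sequences over unlikely ones.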
In 1983, Gus Searcy invented the "Butler In A Box", an electronic voice home controller system.
Birth of smart virtual assistants: 1990s–2010s
In the 1990s, digital speech recognition technology became a feature of the personal computer with
IBM
,
Philips
and
Lernout & Hauspie fighting for customers. Much later the market launch of the first
smartphone
IBM Simon
in 1994 laid the foundation for smart virtual assistants as we know them today.
In 1997, Dragon's
Naturally Speaking software could recognize and transcribe natural human speech without pauses between each word into a document at a rate of 100 words per minute. A version of Naturally Speaking is still available for download and it is still used today, for instance, by many doctors in the US and the UK to document their medical records.
In 2001
Colloquis publicly launched
SmarterChild, on platforms like
AIM and
MSN Messenger
. While entirely text-based, SmarterChild was able to play games, check the weather, look up facts, and converse with users to an extent.
The first modern digital virtual assistant installed on a smartphone was
Siri
, which was introduced as a feature of the
iPhone 4S
on 4 October 2011.
Apple Inc. developed Siri following the 2010 acquisition of
Siri Inc., a
spin-off of
SRI International
, which is a research institute financed by
DARPA
and the
United States Department of Defense
.
Its aim was to aid in tasks such as sending a text message, making phone calls, checking the weather or setting up an alarm. Over time, it has developed to provide restaurant recommendations, search the internet, and provide driving directions.
In November 2014, Amazon announced Alexa alongside the Echo.
In April 2017 Amazon released a service for building
conversational interfaces for any type of virtual assistant or interface.
Large language models: 2020s–present
In the 2020s, artificial intelligence (AI) systems like
ChatGPT have gained popularity for their ability to generate human-like responses to text-based conversations. In February 2020, Microsoft introduced its Turing Natural Language Generation (T-NLG), which was then the "largest language model ever published at 17 billion parameters."
On November 30, 2022, ChatGPT was launched as a prototype and quickly garnered attention for its detailed responses and articulate answers across many domains of knowledge. The advent of ChatGPT and its introduction to the wider public increased interest and competition in the space. In February 2023, Google began introducing an experimental service called "Bard" which is based on its
LaMDA program to generate text responses to questions asked based on information gathered from the
web
.
While ChatGPT and other generalized chatbots based on the latest
generative AI are capable of performing various tasks associated with virtual assistants, there are also more specialized forms of such technology that are designed to target more specific situations or needs.
Method of interaction
Virtual assistants work via:
* Text, including:
online chat
(especially in an
instant messaging
application or other application),
SMS text,
e-mail
or other text-based communication channel, for example
Conversica's intelligent virtual assistants for business.
* Voice: for example with
Amazon Alexa on
Amazon Echo devices, Siri on an
iPhone
, Google Assistant on Google-enabled
Android devices, or
Bixby on Samsung devices.
* Images: some assistants, such as Google Assistant (which includes
Google Lens) and Bixby on the
Samsung Galaxy series, have the added capability of performing
image processing
to recognize objects in images.
Many virtual assistants are accessible via multiple methods, offering versatility in how users can interact with them, whether through chat, voice commands, or other integrated technologies.
Virtual assistants use
natural language processing
(NLP) to match user text or voice input to executable commands. Some continually learn using
artificial intelligence
techniques including
machine learning
and
ambient intelligence
.
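Matching user input to executable commands can be sketched, in a very reduced form, as scoring the input against keyword sets and calling the handler of the best match. Production assistants use trained language-understanding models rather than keyword overlap; the intent names, keywords and handlers below are hypothetical and serve only to show the shape of the mapping.

```python
import re
from typing import Optional

# Hypothetical intents, each described by a set of trigger keywords.
INTENTS = {
    "get_weather": {"weather", "forecast", "rain", "temperature"},
    "set_alarm": {"alarm", "wake", "timer"},
    "play_music": {"play", "music", "song"},
}

# Hypothetical handlers standing in for the executable commands.
HANDLERS = {
    "get_weather": lambda: "Fetching the forecast.",
    "set_alarm": lambda: "Setting an alarm.",
    "play_music": lambda: "Starting playback.",
}


def tokenize(text: str) -> set:
    """Lower-case the input and split it into a set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def match_intent(text: str) -> Optional[str]:
    """Return the intent sharing the most keywords with the input, if any."""
    tokens = tokenize(text)
    best, best_score = None, 0
    for name, keywords in INTENTS.items():
        score = len(tokens & keywords)
        if score > best_score:
            best, best_score = name, score
    return best


def execute(text: str) -> str:
    """Run the handler for the matched intent, or report a failure."""
    intent = match_intent(text)
    if intent is None:
        return "Sorry, I did not understand that."
    return HANDLERS[intent]()
```

Voice input would first pass through speech recognition to produce the text that `execute` receives.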
To activate a virtual assistant using the voice, a wake word might be used. This is a word or group of words such as "Hey Siri", "OK Google" or "Hey Google", "Alexa", and "Hey Microsoft". As virtual assistants become more popular, there are increasing legal risks involved.
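The gating role of a wake word can be sketched on text that is assumed to have been transcribed already. Real assistants typically detect the wake word in the audio signal itself, so the function below is a simplified, hypothetical illustration of the logic only: input without a leading wake phrase is ignored, and anything after the phrase is treated as the command.

```python
from typing import Optional

# Wake phrases taken from the examples in the text above.
WAKE_PHRASES = ("hey siri", "ok google", "hey google", "alexa", "hey microsoft")


def extract_command(transcript: str) -> Optional[str]:
    """Return the command following a wake phrase, or None if there is none.

    An empty string means the wake phrase was spoken without a command.
    """
    # Normalize case, commas and repeated whitespace before comparing.
    normalized = " ".join(transcript.lower().replace(",", " ").split())
    for phrase in WAKE_PHRASES:
        if normalized == phrase or normalized.startswith(phrase + " "):
            return normalized[len(phrase):].strip()
    return None
```

Requiring a space after the phrase avoids treating a longer word that merely begins with it, such as "Alexander", as an activation.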
Devices and objects

Virtual assistants may be integrated into many types of platforms or, like Amazon Alexa, across several of them:
* Into devices like smart speakers such as Amazon Echo, Google Home and
Apple HomePod
* In
instant messaging
applications on both smartphones and via the Web, e.g.
the M virtual assistant on both
Facebook
and
Facebook Messenger apps or via the Web
* Built into a
mobile operating system
(OS), as are Apple's Siri on
iOS devices and BlackBerry Assistant on
BlackBerry 10
devices, or into a desktop OS such as
Cortana on
Microsoft Windows
OS
* Built into a smartphone independent of the OS, as is
Bixby on the
Samsung Galaxy S8 and
Note 8.
* Within instant messaging platforms, assistants from specific organizations, such as
Aeromexico's Aerobot on Facebook Messenger or
WeChat
Secretary.
* Within mobile apps from specific companies and other organizations, such as Dom from
Domino's Pizza
* In appliances, cars, and
wearable technology such as the
Ai Pin
* Previous generations of virtual assistants often worked on websites, such as
Alaska Airlines' Ask Jenn, or on
interactive voice response
(IVR) systems such as
American Airlines
' IVR by
Nuance.
Services
Virtual assistants can provide a wide variety of services. These include:
* Provide information such as weather, facts from e.g.
Wikipedia
or
IMDb
, set an alarm, make to-do lists and shopping lists
* Play music from streaming services such as
Spotify
and
Pandora; play radio stations; read
audiobooks
* Play videos, TV shows or movies on televisions, streaming from e.g.
Netflix
*
Conversational commerce (see below)
* Assist public interactions with government (see
Artificial intelligence in government)
* Complement and/or replace human customer service specialists
in domains like healthcare, sales, and banking. One report estimated that an automated online assistant produced a 30% decrease in the workload for a human-provided
call centre
.
* Enhance the driving experience by enabling interaction with virtual assistants like Siri and Alexa while in the car.
Conversational commerce
Conversational commerce is
e-commerce
via various means of messaging, including via voice assistants but also
live chat on e-commerce
Web sites
, live chat on messaging applications such as
WeChat
, Facebook Messenger and
WhatsApp
and
chatbots on messaging applications or Web sites.
Customer support
A virtual assistant can work with the customer support team of a business to provide
24x7 support to customers. It provides quick responses, which enhances a customer's experience.
Third-party services
Amazon enables Alexa "Skills" and Google enables "Actions", which are essentially applications that run on the assistant platforms.
Privacy
Virtual assistants have a
variety of privacy concerns associated with them. Features such as activation by voice pose a threat, as such features require the device to always be listening. Modes of privacy such as the virtual security button have been proposed to create a multilayer authentication for virtual assistants.
Google Assistant
The privacy policy of Google Assistant states that it does not store the audio data without the user's permission, but may store the conversation transcripts to personalise its experience. Personalisation can be turned off in settings. If a user wants Google Assistant to store audio data, they can go to Voice & Audio Activity (VAA) and turn on this feature. Audio files are sent to the cloud and used by Google to improve the performance of Google Assistant, but only if the VAA feature is turned on.
Amazon Alexa
The privacy policy of Amazon's virtual assistant, Alexa, states that it only listens to conversations when its wake word (like Alexa, Amazon, Echo) is used. It starts recording the conversation after the call of a wake word, and stops recording after 8 seconds of silence. It sends the recorded conversation to the cloud. It is possible to delete the recording from the cloud by visiting 'Alexa Privacy' in 'Alexa'.
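The recording window described above (start after the wake word, stop after 8 seconds of silence) can be modelled as a small state machine. This is a sketch of the stated policy only, not Amazon's implementation; the `Frame` type and its event labels are assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List

SILENCE_TIMEOUT_S = 8.0  # silence duration after which recording stops


@dataclass
class Frame:
    t: float    # time of the frame in seconds
    event: str  # hypothetical label: "wake", "speech" or "silence"


def recorded_frames(frames: Iterable[Frame],
                    timeout: float = SILENCE_TIMEOUT_S) -> List[Frame]:
    """Return the frames that would be recorded under the stated policy."""
    recording = False
    last_sound = 0.0
    kept = []
    for frame in frames:
        if not recording:
            # Nothing is kept until a wake word is heard.
            if frame.event == "wake":
                recording = True
                last_sound = frame.t
            continue
        if frame.event in ("speech", "wake"):
            last_sound = frame.t
        elif frame.t - last_sound >= timeout:
            # Silence has lasted long enough: stop recording.
            recording = False
            continue
        kept.append(frame)
    return kept
```

In this model, speech that occurs before a wake word or after the silence timeout is never added to the recording.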
Apple's Siri
Apple states that it does not record audio to improve Siri. Instead, it claims to use transcripts. Transcript data is only sent if it is deemed important for analysis. Users can opt out at any time if they do not want Siri to send the transcripts to the cloud.
Cortana
Cortana is a voice-only virtual assistant with singular authentication. This voice-activated device accesses user data to perform common tasks like checking weather or making calls, raising privacy concerns due to the lack of secondary authentication.
Consumer interest
Presumed added value of a new way of interacting
The added value of virtual assistants can come from, among other things, the following:
* Voice communication can sometimes represent the optimal
man-machine communication:
# It is convenient: in some sectors voice is the only possible means of communication, and more generally it frees up both hands and vision for another activity in parallel, and it also helps disabled people.
# It is faster: voice is more efficient than typing on a keyboard: people can speak up to 200
words per minute, as opposed to 60 when typing on a keyboard. It is also more natural and thus requires less effort (reading a text, however, can reach 700 words per minute).
* Virtual assistants save a lot of time through automation: they can make appointments or read the news while the consumer does something else. It is also possible to ask the virtual assistant to schedule meetings, which helps to organize time. The designers of new digital schedulers explained their ambition that these calendars schedule people's lives so that consumers use their time more efficiently, through machine learning processes and complete organization of work time and free time. For example, when the consumer expresses the desire to schedule a break, the VA will schedule it at an optimal moment for this purpose (for example at a time of the week when they are less productive), with the additional long-term objective of being able to schedule and organize the consumer's free time to ensure optimal work efficiency.
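The speaking, typing and reading rates quoted in the list above can be turned into a short worked comparison. The 300-word message length is an arbitrary assumption chosen for the example; the rates are the upper figures given in the text.

```python
# Rates in words per minute, as quoted in the text.
RATES_WPM = {"speaking": 200, "typing": 60, "reading": 700}

MESSAGE_WORDS = 300  # assumed message length, for illustration only


def minutes_needed(words: int, rate_wpm: int) -> float:
    """Time in minutes to convey `words` words at `rate_wpm` words per minute."""
    return words / rate_wpm


for mode, rate in RATES_WPM.items():
    print(f"{mode}: {minutes_needed(MESSAGE_WORDS, rate):.2f} min")
```

At these rates, dictating the message takes 1.5 minutes against 5 minutes for typing it, while reading the same text takes under half a minute, which is why spoken input combined with written output can be an efficient pairing.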
Perceived interest

* According to a 2019 study, the two reasons consumers use virtual assistants are perceived usefulness and perceived enjoyment. The first result of this study is that perceived usefulness and perceived enjoyment have an equivalent, very strong influence on the consumer's willingness to use a virtual assistant.
* The second result of this study is that:
# Provided content quality has a very strong influence on perceived usefulness and a strong influence on perceived enjoyment.
# Visual attractiveness has a very strong influence on perceived enjoyment.
# Automation has a strong influence on perceived usefulness.
Controversies
Artificial intelligence controversies
* Virtual assistants spur the
filter bubble
: As with
social media
, virtual assistants' algorithms are trained to show pertinent data and discard other data based on the consumer's previous activities: pertinent data is whatever will interest or please the consumer. As a result, consumers become isolated from data that disagrees with their viewpoints, which effectively confines them to their own intellectual bubble and reinforces their opinions. This phenomenon is known to reinforce
fake news and
echo chambers.
* Virtual assistants are also sometimes criticized for being overrated. In particular,
A. Casilli points out that the AI of virtual assistants is neither intelligent nor artificial, for two reasons:
# Not intelligent, because all they do is assist the human, and only with tasks that a human could do easily, within a very limited spectrum of actions: finding, classifying, and presenting information, offers, or documents. Also, virtual assistants can neither make decisions on their own nor anticipate things.
# Not artificial, because they would be impossible without human labelling through
microwork.
Ethical implications
In 2019
Antonio A. Casilli, a French
sociologist, criticized artificial intelligence and virtual assistants in particular in the following way:
At a first level, the fact that the consumer provides free data for the training and improvement of the virtual assistant, often without knowing it, is ethically disturbing.
But at a second level, it might be even more ethically disturbing to know how these
AIs are trained with this data.
This
artificial intelligence
is trained via
neural networks
, which require a huge amount of
labelled data. However, this data needs to be labelled through a human process, which explains the rise of
microwork in the last decade. Microwork means remotely employing people worldwide to do repetitive and very simple tasks for a few cents, such as listening to virtual assistant speech data and writing down what was said. It has been criticized for the job insecurity it causes and for its total lack of regulation: the average wage was 1.38 dollars per hour in 2010, and it provides no healthcare, retirement benefits,
sick pay, or
minimum wage
A minimum wage is the lowest remuneration that employers can legally pay their employees—the price floor below which employees may not sell their labor. List of countries by minimum wage, Most countries had introduced minimum wage legislation b ...
. Hence, virtual assistants and their designers are controversial for spurring job insecurity, and the AIs they propose are still human in the way that they would be impossible without the microwork of millions of human workers.
Privacy concerns are raised by the fact that voice commands are available to the providers of virtual assistants in unencrypted form, and can thus be shared with third parties and be processed in an unauthorized or unexpected manner.
In addition to the linguistic content of recorded speech, a user's manner of expression and voice characteristics can implicitly contain information about his or her biometric identity, personality traits, body shape, physical and mental health condition, sex, gender, moods and emotions, socioeconomic status and geographical origin.
Developer platforms
Notable developer platforms for virtual assistants include:
* Amazon Lex was introduced in November 2016 and opened to developers in April 2017. It combines natural language understanding technology with automatic speech recognition.
* Google provides the Actions on Google and Dialogflow platforms for developers to create "Actions" for Google Assistant.
* Apple provides SiriKit for developers to create extensions for Siri.
* IBM's Watson, while sometimes spoken of as a virtual assistant, is in fact an entire artificial intelligence platform and community powering some virtual assistants, chatbots, and many other types of solutions.
Previous generations
In previous generations of text chat-based virtual assistants, the assistant was often represented by an avatar (a.k.a. ''interactive online character'' or ''automated character''); this was known as an embodied agent.
Economic relevance
For individuals
Digital experiences enabled by virtual assistants are considered to be among the major recent technological advances and most promising consumer trends. Experts claim that digital experiences will achieve a status-weight comparable to 'real' experiences, if not become more sought-after and prized. The trend is supported by a high number of frequent users and the substantial growth in worldwide user numbers of virtual digital assistants. In mid-2017, the number of frequent users of digital virtual assistants was estimated at around 1 billion worldwide. In addition, virtual digital assistant technology is no longer restricted to smartphone applications, but is present across many industry sectors (including automotive, telecommunications, retail, healthcare and education).
In response to the significant R&D expenses of firms across all sectors and the increasing adoption of mobile devices, the market for speech recognition technology is predicted to grow at a CAGR of 34.9% globally over the period of 2016 to 2024 and thereby surpass a global market size of US$7.5 billion by 2024.
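The growth figures above are linked by the standard compound annual growth rate relation, end value = start value × (1 + rate)^years. The sketch below is a minimal illustration only. It assumes that the 34.9% rate compounds over the eight years from 2016 to 2024 and that the 2024 market size is exactly US$7.5 billion (the source says only that this level is surpassed), so the implied 2016 base is a rough back-calculation and not a figure from the cited forecast. All function and variable names are hypothetical.

```python
def project_value(start_value: float, cagr: float, years: int) -> float:
    """Compound start_value forward at a constant annual growth rate."""
    return start_value * (1.0 + cagr) ** years


def implied_start_value(end_value: float, cagr: float, years: int) -> float:
    """Back out the starting value that compounds to end_value."""
    return end_value / (1.0 + cagr) ** years


def cagr_from_values(start_value: float, end_value: float, years: int) -> float:
    """Recover the constant annual growth rate between two values."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Assumptions (not stated in the source): 8 compounding years and an
# end value of exactly 7.5 (billions of US dollars).
CAGR = 0.349
YEARS = 2024 - 2016
END_VALUE_BN = 7.5

base_2016_bn = implied_start_value(END_VALUE_BN, CAGR, YEARS)

if __name__ == "__main__":
    print(f"Implied 2016 market size: about US${base_2016_bn:.2f} billion")
```

Under these assumptions the implied 2016 base is well below US$1 billion, which shows how strongly a rate of this size compounds over eight years.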
According to an Ovum study, the "native digital assistant installed base" is projected to exceed the world's population by 2021, with 7.5 billion active voice AI–capable devices.
According to Ovum, by that time "Google Assistant will dominate the voice AI–capable device market with 23.3% market share, followed by Samsung's Bixby (14.5%), Apple's Siri (13.1%), Amazon's Alexa (3.9%), and Microsoft's Cortana (2.3%)."
Taking into consideration the regional distribution of market leaders, North American companies (e.g. Nuance Communications, IBM, eGain) are expected to dominate the industry over the coming years, due to the significant impact of BYOD (Bring Your Own Device) and enterprise mobility business models. Furthermore, the increasing demand for smartphone-assisted platforms is expected to further boost the growth of the North American intelligent virtual assistant (IVA) industry. Despite its smaller size in comparison to the North American market, the intelligent virtual assistant industry of the Asia-Pacific region, with its main players located in India and China, is predicted to grow at an annual rate of 40% (above the global average) over the 2016–2024 period.
Economic opportunity for enterprises
Virtual assistants should not be seen only as a gadget for individuals, as they can have real economic utility for enterprises. For example, a virtual assistant can take the role of an always-available assistant with encyclopedic knowledge that can organize meetings, check inventories, and verify information. Virtual assistants are all the more important because their integration in small and medium-sized enterprises often serves as an easy first step toward the broader adoption and use of the Internet of Things (IoT). Indeed, small and medium-sized enterprises initially perceive IoT technologies as critically important, but too complicated, risky or costly to use.
Security
In May 2018, researchers from the University of California, Berkeley, published a paper showing that audio commands undetectable to the human ear could be directly embedded into music or spoken text, thereby manipulating virtual assistants into performing certain actions without the user noticing. The researchers made small changes to audio files that cancelled out the sound patterns that speech recognition systems are meant to detect. These were replaced with sounds that the system would interpret differently, commanding it to dial phone numbers, open websites or even transfer money.
The possibility of this has been known since 2016, and affects devices from Apple, Amazon and Google.
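The attack described above belongs to the broader family of adversarial examples: small, bounded changes to an input that flip a model's output. The following is a toy sketch of that general idea only, and not the method used in the Berkeley paper. It assumes a hypothetical linear "command detector" over a handful of made-up audio features and applies a sign-based perturbation in the style of the fast gradient sign method. Every name, weight, and value is invented for illustration.

```python
from typing import List

# Hypothetical linear detector: a score above 0 means "command recognized".
WEIGHTS: List[float] = [0.8, -0.5, 0.3, -0.9, 0.6]
BIAS = -0.2


def score(features: List[float]) -> float:
    """Linear detector score for a feature vector."""
    return sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS


def sign(value: float) -> float:
    """Return -1.0, 0.0 or 1.0 according to the sign of value."""
    return float((value > 0) - (value < 0))


def perturb(features: List[float], epsilon: float) -> List[float]:
    """Shift each feature by at most epsilon in the direction that raises the score.

    For a linear model the gradient of the score with respect to the input
    is simply WEIGHTS, so the sign-based step is epsilon * sign(weight).
    """
    return [x + epsilon * sign(w) for w, x in zip(WEIGHTS, features)]


# Made-up features standing in for ordinary audio that triggers no command.
benign = [0.1, 0.2, 0.0, 0.1, 0.05]
adversarial = perturb(benign, epsilon=0.1)

if __name__ == "__main__":
    print("benign recognized as command:", score(benign) > 0)
    print("perturbed recognized as command:", score(adversarial) > 0)
```

In this toy setting no feature moves by more than `epsilon`, yet the detector's decision changes. Real attacks on speech recognition operate on raw waveforms and far larger models, but rely on the same sensitivity to small, targeted changes.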
In addition to unintentional actions and voice recording, another security and privacy risk associated with intelligent virtual assistants is malicious voice commands: an attacker who impersonates a user can issue malicious voice commands to, for example, unlock a smart door to gain unauthorized entry to a home or garage, or order items online without the user's knowledge. Although some IVAs provide a voice-training feature to prevent such impersonation, it can be difficult for the system to distinguish between similar voices. Thus, a malicious person who is able to access an IVA-enabled device might be able to fool the system into thinking that they are the real owner and carry out criminal or mischievous acts.
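Why similar voices are hard to tell apart can be illustrated with a threshold-based speaker check. The sketch below is a simplified assumption about how a voice-match feature might work, not a description of any vendor's implementation. Each voice is reduced to a hypothetical embedding vector, and a speaker is accepted when the cosine similarity to the enrolled owner reaches a threshold. The vectors and the 0.95 threshold are invented.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def is_accepted(enrolled: Sequence[float],
                candidate: Sequence[float],
                threshold: float = 0.95) -> bool:
    """Accept the candidate if it is close enough to the enrolled voice."""
    return cosine_similarity(enrolled, candidate) >= threshold


# Made-up voice embeddings (hypothetical values, three dimensions for brevity).
owner = [0.9, 0.1, 0.4]
similar_voice = [0.85, 0.15, 0.45]   # e.g. a relative or a skilled imitator
different_voice = [0.1, 0.9, 0.2]

if __name__ == "__main__":
    print("similar voice accepted:", is_accepted(owner, similar_voice))
    print("different voice accepted:", is_accepted(owner, different_voice))
```

The design trade-off is visible in the threshold: raising it rejects more impostors but also rejects the legitimate owner more often (for instance when they have a cold), while lowering it admits voices that merely resemble the owner's.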
Comparison of notable assistants
See also
* Applications of artificial intelligence
* Artificial conversational entity
* Artificial human companion
* Autonomous agent
* Computer facial animation
* Expert system
* Friendly artificial intelligence
* Home network
* Hybrid intelligent system
* Intelligent agent
* Interactions Corporation
* Knowledge Navigator
* Office Assistant
* Multi-agent system
* Simulation hypothesis
* Social bot
* Social data revolution
* Software bot
* Wizard (software)