ElevenLabs is a software company that specializes in developing natural-sounding
speech synthesis
software using
deep learning.
It has been recognized as one of the major companies behind the ongoing
AI boom
.
History
ElevenLabs was co-founded in 2022 by
Piotr Dąbkowski, an ex-
Google
machine learning engineer and Mateusz Staniszewski, an ex-
Palantir deployment strategist. Both were raised in Poland, and their inspiration for founding ElevenLabs reportedly came from watching inadequately
dubbed American films.
Dąbkowski and Staniszewski initially considered different funding options, including working with a startup accelerator. In January 2023, they announced that they had secured a $2 million pre-seed round. The startup's specialization in AI voice intelligence, a still-emerging field in
Europe
, played a significant role in attracting investors. The pre-seed round was led by Credo Ventures, with participation from Concept Ventures.
In January 2023, ElevenLabs publicly released its
beta platform.
In June 2023, ElevenLabs raised a $19 million
Series A
funding round at a valuation of about $100 million,
despite the company having no office and only 15 employees.
The funding round was co-led by the venture capital firm
Andreessen Horowitz
, former GitHub CEO
Nat Friedman
, and entrepreneur
Daniel Gross
. It also saw participation from prominent investors such as
SV Angel
,
Mike Krieger
(co-founder of Instagram),
Brendan Iribe
(co-founder of Oculus),
Mustafa Suleyman
(co-founder of DeepMind), and
Tim O'Reilly
(founder of O'Reilly Media). It was also announced that Andreessen Horowitz would be joining ElevenLabs' board.
On January 22, 2024, ElevenLabs raised an additional $80 million in
Series B
funding, raising the company's valuation to $1.1 billion. The funding round was led by Andreessen Horowitz, Friedman, Gross, and
Sequoia Capital
. Additionally, the company announced a series of new products, including their Voice Marketplace, AI Dubbing Studio, and mobile app.
Products
ElevenLabs is primarily known for its
browser-based
, AI-assisted text-to-speech software, Speech Synthesis, which can produce lifelike speech by synthesizing
vocal emotion
and
intonation. The company states that its models are trained to interpret the context in the text, and adjust the intonation and pacing accordingly.
The software analyzes the context of the text to detect emotions such as anger, sadness, happiness, or alarm, and uses the detected sentiment to produce more realistic, human-like inflection. The startup is in the process of patenting this technology.
On its beta site, users can submit text and generate
audio files
from a selection of default voices. Paying users are given the ability to upload custom voice samples to create new vocal styles using the company's voice cloning tool.
Voice Library is the company's feature for sharing unique voice profiles created using their Voice Design technology. These pre-designed voice profiles allow users to select a voice that best suits their needs, rather than creating one from scratch. There are now more than 1,000 community-created voices in the library. Another tool called VoiceLab allows users to clone voices from just a few short snippets of audio and can create entirely new synthetic voices.
On 20 June 2023, ElevenLabs released an AI recognition tool called the AI Speech Classifier, which it claims is the first of its kind.
The tool is accessible through an
API
and is designed to determine whether an uploaded audio sample originates from ElevenLabs' proprietary AI technology.
The company has expressed its intention to collaborate with other AI developers in creating a universal detection system that could be adopted industry-wide.
In July 2023, ElevenLabs announced "Projects", a tool for creating long-form spoken content such as audiobooks and dialogue segments with contextually-aware synthetic or custom voices.
The tool was released in September. In August, ElevenLabs expanded its voice generation capabilities to 28 languages. Using an in-house AI model, it automatically detects languages like
Korean
,
Dutch
, and
Vietnamese
, allowing for "emotionally rich" multilingual speech generation. The company also announced that its technology had officially exited its
beta phase.
In October 2023, ElevenLabs presented "AI Dubbing," a tool that is able to translate speech into more than 20 languages. The feature is capable of preserving the speaker's original voice, emotions, and intonation, by employing proprietary methods to handle tasks like noise removal, speaker differentiation, transcription, and synchronization of translated speech with the original audio.
In May 2024, ElevenLabs launched a
text-to-music
model. In June 2024, ElevenLabs released the ElevenLabs Reader App on iOS and Android, which allows users to listen to articles, PDFs, and ePubs with AI voices on their phones. In July 2024, ElevenLabs released "Voice Isolator", which removes background noise from audio.
Uses
ElevenLabs' use cases span a range of sectors.
Content creators have used ElevenLabs for podcasts, narration, and comedy shows. In March 2023, comedian
Drew Carey
used ElevenLabs' voice cloning tool to recreate his voice for an episode of his radio show, ''Friday Night Freakout''.
In April 2023,
Polish
TV and radio presenter
Jaroslaw Kuzniar used a synthesized version of his voice to deliver a series of podcasts on the
Russian invasion of Ukraine
.
Seth Godin
has also used ElevenLabs to narrate his AI-focused podcast.
Tim Green, former NFL player and author, utilizes ElevenLabs' AI voice cloning technology for his podcast, "Tim Green's Nothing Left Unsaid." Diagnosed with a slow-progressing form of
amyotrophic lateral sclerosis
, Green's ability to communicate verbally has been affected. The technology lets him host conversations with guests using a version of his voice cloned from earlier recordings, so he can continue the podcast despite his condition.
In March 2023, Super-Hi-Fi, a streaming automation service, partnered with ElevenLabs to launch a fully automated radio service called "AI Radio", using ElevenLabs' software to voice its virtual DJ from prompts generated with
ChatGPT
. ElevenLabs has also been employed for narrating games and voicing game characters in partnerships with Swedish game developer
Paradox Interactive
and the United Kingdom-based Magicave.
Publishers and authors have used ElevenLabs to narrate audiobooks and newsletters.
On 13 June 2023,
Storytel
announced an exclusive partnership with ElevenLabs. Through this collaboration, ElevenLabs will create voices tailored specifically to Storytel's core markets and produce AI-narrated audiobooks. A voice-changing feature called VoiceSwitcher was implemented to let users personalize their listening experience.
ElevenLabs has been used to generate audio for dubbing videos in different languages, including by content creators.
The platform has the capability to accurately replicate almost any accent in any language. Celebrity fans have used ElevenLabs to create inspirational messages using the voices of their favorite celebrities.
In February,
VICE
reporter Joseph Cox published findings that he had recorded five minutes of himself talking and then used ElevenLabs to create voice deepfakes that defeated a bank's
voice-authentication system.
In July, U.S. Representative Jennifer Wexton used ElevenLabs to create a replica of her voice after losing hers to Parkinson’s Disease-like Progressive Supranuclear Palsy (PSP).
ElevenLabs sets explicit guidelines regarding the use of its technology, forbidding the cloning of voices for abusive purposes such as fraud, discrimination, hate speech, or online abuse, although it does support the use of its platform for “caricature, parody and satire” and “artistic and political speech contributing to public debates.” The company reserves the right to suspend the accounts and content of users who violate these guidelines, and states that it will cooperate with authorities and report illegal activities in accordance with applicable laws.
In January, the company admitted that its platform has been used for “voice cloning misuse cases” and toughened its safeguards against vexatious use of its technology.
Reception
Following its launch in January 2023, ElevenLabs gained rapid momentum and was commended for its voice output quality, fast generation times, and a "generous free tier". It has also been praised for its ability to accurately pronounce names with unique or uncommon pronunciations, addressing a common shortcoming in similar tools that often cater primarily to Western names. The company reached over one million registered users between its launch and June 2023.
Criticism and controversy
ElevenLabs was criticized after users were able to abuse its software to generate controversial statements in the vocal style of celebrities, public officials, and other famous individuals,
particularly attracting attention after users on
4chan
used the tool to share hateful messages.
The software's ability to closely replicate real voices has raised
ethical concerns, with critics likening it to
deepfaking. In response, the company said it would work on mitigating potential abuse through safeguards and
identity verification
.
The company has subsequently limited access to its voice cloning feature to paid subscribers, citing the requirement to provide payment information as a means of improving accountability, and has implemented bans on users who repeatedly violate the terms of service.
In the leadup to the
January 2024 New Hampshire Democratic primary, AI-generated robocalls purportedly from
Joe Biden encouraging voters to skip voting on the day of the primary were sent to thousands of residents. The New Hampshire attorney general's office launched an investigation into the incident and linked it to a company based in Texas, with audio experts concluding the call was made using ElevenLabs. In response to the incident, CEO Mati Staniszewski stated that the company was “dedicated to preventing the misuse of audio AI tools” but provided no comment on specific incidents.
Additional concerns have been raised over the ethics of the source of ElevenLabs' training data, with multiple
voice actors
claiming ElevenLabs used samples of their voices without their consent.
ElevenLabs, along with other companies in its category, has thus been seen as a potential challenge to the voice acting sector.
See also
* 15.ai
* Respeecher
External links
* Official website: https://elevenlabs.io/