Music and artificial intelligence

Music and artificial intelligence (music and AI) is the development of music software programs which use AI to generate music. As with applications in other fields, AI in music also simulates mental tasks. A prominent feature is the capability of an AI algorithm to learn from past data, as in computer accompaniment technology, in which the AI listens to a human performer and plays accompaniment. Artificial intelligence also drives interactive composition technology, in which a computer composes music in response to a live performance. Other AI applications in music cover not only composition, production, and performance but also the way music is marketed and consumed. Several music player programs have been developed to use voice recognition and natural language processing technology for music voice control. Current research includes the application of AI in music composition, performance, theory and digital sound processing.

Composers and artists such as Jennifer Walshe and Holly Herndon have been exploring aspects of music AI for years in their performances and musical works. Another original approach, of humans "imitating AI", can be found in the 43-hour sound installation ''String Quartet(s)'' by Georges Lentz (see the composer's interview with ChatGPT-4 on music and AI). The 20th-century art historian Erwin Panofsky proposed that in all art there exist three levels of meaning: primary meaning, or the natural subject; secondary meaning, or the conventional subject; and tertiary meaning, the intrinsic content of the subject. AI music explores the foremost of these, creating music without the "intention" that usually lies behind it, leaving composers who listen to machine-generated pieces feeling unsettled by the lack of apparent meaning.


History

In the 1950s and 1960s, music made by artificial intelligence was not fully original but was generated from templates that people had already defined and given to the AI, an approach known as rule-based systems. As computers became more powerful, machine learning and artificial neural networks entered the music industry, letting AI learn how music is made from large amounts of data instead of predefined templates. By the early 2000s, further advances had been made, with generative adversarial networks (GANs) and deep learning being used to help AI compose music that is more original, complex and varied than previously possible. Notable AI-driven projects, such as OpenAI's MuseNet and Google's Magenta, have demonstrated AI's ability to generate compositions that mimic various musical styles.


Timeline

Artificial intelligence finds its beginnings in music with the transcription problem: accurately recording a performance into musical notation as it is played. Père Engramelle's schematic of a "piano roll", a mode of automatically recording note timing and duration in a way which could easily be transcribed to proper musical notation by hand, was first implemented by German engineers J.F. Unger and J. Hohlfield in 1752.

In 1957, the ILLIAC I (Illinois Automatic Computer) produced the "Illiac Suite for String Quartet", a completely computer-generated piece of music. The computer was programmed to accomplish this by composer Leonard Isaacson and mathematician Lejaren Hiller. In 1960, Russian researcher Rudolf Zaripov published the world's first paper on algorithmic music composition, using the Ural-1 computer. In 1965, inventor Ray Kurzweil developed software capable of recognizing musical patterns and synthesizing new compositions from them; the computer and its music appeared on the quiz show ''I've Got a Secret'' that same year.

By 1983, Yamaha Corporation's Kansei Music System had gained momentum, and a paper was published on its development in 1989. The software used music information processing and artificial intelligence techniques to essentially solve the transcription problem for simpler melodies, although higher-level melodies and musical complexities are regarded even today as difficult deep-learning tasks, and near-perfect transcription remains a subject of research. In 1997, an artificial intelligence program named Experiments in Musical Intelligence (EMI) appeared to outperform a human composer at the task of composing a piece of music imitating the style of Bach. EMI would later become the basis for a more sophisticated algorithm called Emily Howell, named for its creator. In 2002, the music research team at the Sony Computer Science Laboratory Paris, led by French composer and scientist François Pachet, designed the Continuator, an algorithm uniquely capable of resuming a composition after a live musician stopped. Emily Howell continued the advancement of musical artificial intelligence, publishing its first album ''From Darkness, Light'' in 2009. Since then, many more pieces by artificial intelligence and various groups have been published.

In 2010, Iamus became the first AI to produce a fragment of original contemporary classical music in its own style: "Iamus' Opus 1". Housed at the Universidad de Málaga (University of Málaga) in Spain, the computer can generate a fully original piece in a variety of musical styles. In August 2019, a large dataset of 12,197 MIDI songs, each with its lyrics and melody, was created to investigate the feasibility of neural melody generation from lyrics using a deep conditional LSTM-GAN method. With progress in generative AI, models capable of creating complete musical compositions (including lyrics) from a simple text description have begun to emerge. Two notable web applications in this field are Suno AI, launched in December 2023, and Udio, which followed in April 2024.


Software applications


ChucK

Developed at Princeton University by Ge Wang and Perry Cook, ChucK is a text-based, cross-platform audio programming language. By extracting and classifying the theoretical techniques it finds in musical pieces, the software is able to synthesize entirely new pieces from the techniques it has learned. The technology is used by SLOrk (Stanford Laptop Orchestra) and PLOrk (Princeton Laptop Orchestra).


Jukedeck

Jukedeck was a website that let people use artificial intelligence to generate original, royalty-free music for use in videos. The team started building the music generation technology in 2010, formed a company around it in 2012, and launched the website publicly in 2015. The technology used was originally a rule-based algorithmic composition system, which was later replaced with artificial neural networks. The website was used to create over 1 million pieces of music, and brands that used it included Coca-Cola, Google, UKTV, and the Natural History Museum, London. In 2019, the company was acquired by ByteDance.


MorpheuS

MorpheuS is a research project by Dorien Herremans and Elaine Chew at Queen Mary University of London, funded by a Marie Skłodowska-Curie EU project. The system uses an optimization approach based on a variable neighborhood search algorithm to morph existing template pieces into novel pieces with a set level of tonal tension that changes dynamically throughout the piece. This optimization approach allows for the integration of a pattern detection technique in order to enforce long-term structure and recurring themes in the generated music. Pieces composed by MorpheuS have been performed at concerts in both Stanford and London.
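To illustrate the optimization approach, below is a minimal sketch of a variable neighborhood search that perturbs a template melody until its windowed "tension" matches a target profile. The tension measure, window size, and neighborhood schedule are illustrative assumptions; MorpheuS's actual tension model and move operators are considerably more sophisticated.

```python
import random

def tension(melody, tonic=60):
    # Toy proxy for tonal tension: mean distance from the tonic pitch.
    return sum(abs(p - tonic) for p in melody) / len(melody)

def cost(melody, target_profile, window=4):
    # Squared error between windowed tension and the desired tension profile.
    windows = [melody[i:i + window] for i in range(0, len(melody) - window + 1, window)]
    return sum((tension(w) - t) ** 2 for w, t in zip(windows, target_profile))

def variable_neighborhood_search(melody, target_profile, iters=2000):
    best, best_cost = list(melody), cost(melody, target_profile)
    k = 1                                    # current neighborhood size (semitones)
    for _ in range(iters):
        cand = list(best)
        i = random.randrange(len(cand))
        cand[i] += random.randint(-k, k)     # perturb one note within the neighborhood
        c = cost(cand, target_profile)
        if c < best_cost:
            best, best_cost, k = cand, c, 1  # improvement: back to smallest neighborhood
        else:
            k = min(k + 1, 12)               # stuck: explore a more distant neighborhood
    return best

template = [60, 62, 64, 65, 67, 69, 71, 72] * 2   # template piece (MIDI pitches)
profile = [2.0, 6.0, 9.0, 4.0]                    # desired tension per 4-note window
print(variable_neighborhood_search(template, profile))
```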


AIVA

Created in February 2016 in Luxembourg, AIVA is a program that produces soundtracks for any type of media. The algorithms behind AIVA are based on deep learning architectures. AIVA has also been used to compose a rock track called ''On the Edge'', as well as a pop tune, ''Love Sick'', in collaboration with singer Taryn Southern, for the creation of her 2018 album ''I am AI''.


Google Magenta

Google's Magenta team has published several AI music applications and technical papers since its launch in 2016. In 2017 it released the NSynth algorithm and dataset, along with an open-source hardware musical instrument designed to make the algorithm accessible to musicians. The instrument was used by notable artists such as Grimes and YACHT in their albums. In 2018, the team released a piano improvisation app called Piano Genie. This was later followed by Magenta Studio, a suite of five MIDI plugins that allow music producers to elaborate on existing music in their DAW. In 2023, their machine learning team published a technical paper on GitHub describing MusicLM, a private text-to-music generator which they had developed.


Riffusion

Riffusion is a generative AI system that produces music from text prompts by generating spectrogram images with a fine-tuned version of the Stable Diffusion image model and converting them into audio clips.

Spike AI

Spike AI is an AI-based audio plug-in, developed by Spike Stent in collaboration with his son Joshua Stent and friend Henry Ramsey, that analyzes tracks and provides suggestions to increase clarity and other aspects during mixing. Communication is handled by a chatbot trained on Spike Stent's personal data. The plug-in integrates into digital audio workstations.


Musical applications

Artificial intelligence can potentially impact how producers create music by generating reiterations of a track that follow a prompt given by the creator, allowing the AI to pursue a particular style the artist is aiming for. AI has also been applied in musical analysis, where it has been used for feature extraction, pattern recognition, and musical recommendation. New tools powered by artificial intelligence, such as AIVA (Artificial Intelligence Virtual Artist) and Udio, have been built to help generate original music compositions. This is done by giving an AI model data from existing music and having it analyze that data with deep learning techniques in order to generate music in many different genres, such as classical or electronic music.


Ethical and legal considerations

Several musicians, such as Dua Lipa, Elton John, Nick Cave, Paul McCartney and Sting, have criticized the use of AI in music and have encouraged the UK government to act on the matter, while other artists, such as Grimes, have embraced it. While helpful in generating new music, many issues have arisen since artificial intelligence began making music. Major concerns include the economic impact of AI taking over music production, the question of who truly owns music generated by AI, and a lower demand for human-made compositions. Some critics argue that AI diminishes the value of human creativity, while proponents see it as an augmentative tool that expands artistic possibilities rather than replacing human musicians. Concerns have also been raised about AI's potential to homogenize music: AI-driven models often generate compositions based on existing trends, which some fear could limit musical diversity. Addressing this concern, researchers are working on AI systems that incorporate more nuanced creative elements, allowing for greater stylistic variation.

Another major concern is copyright law. Many questions have been raised about who owns AI-generated music and productions, as today's copyright laws require a work to be human-authored in order to be granted copyright protection. One proposed solution is to create hybrid laws that recognize both the artificial intelligence that generated the creation and the humans who contributed to it. In the United States, the current legal framework tends to apply traditional copyright law to AI, despite its differences from the human creative process; music outputs solely generated by AI, however, are not granted copyright protection. In the Compendium of U.S. Copyright Office Practices, the Copyright Office has stated that it would not grant copyright to "works that lack human authorship" and that "the Office will not register works produced by a machine or mere mechanical process that operates randomly or automatically without any creative input or intervention from a human author." In February 2022, the Copyright Review Board rejected an application to copyright AI-generated artwork on the basis that it "lacked the required human authorship necessary to sustain a claim in copyright."

The use of copyrighted music to train AI has also been a topic of contention. In one instance, SACEM, a French professional organization of songwriters, composers, and music publishers, demanded that PozaLabs, an AI music generation startup, refrain from using any music affiliated with it to train models. The situation in the European Union (EU) is similar to that in the US, as its legal framework also emphasizes the role of human involvement in a copyright-protected work (Bulayenko, Oleksandr; Quintais, João Pedro; Gervais, Daniel J.; Poort, Joost (February 28, 2022). "AI Music Outputs: Challenges to the Copyright Legal Framework". ''reCreating Europe Report''. Retrieved 2024-04-03). According to the European Union Intellectual Property Office and the recent jurisprudence of the Court of Justice of the European Union, the originality criterion requires a work to be the author's own intellectual creation, reflecting the author's personality through the creative choices made during its production, which demands a distinct level of human involvement. The reCreating Europe project, funded by the European Union's Horizon 2020 research and innovation program, examines the challenges posed by AI-generated content, including music, and calls for legal certainty and balanced protection that encourages innovation while respecting copyright norms.

The recognition of AIVA marks a significant departure from traditional views on authorship and copyright in music composition, allowing an AI artist to release music and earn royalties; this acceptance makes AIVA a pioneering instance of an AI being formally acknowledged within music production. Recent advances in artificial intelligence by groups such as Stability AI, OpenAI, and Google have drawn a large number of copyright claims against generative technology, including AI music. Should these lawsuits succeed, the machine learning models behind these technologies would have their datasets restricted to the public domain. Strides toward addressing ethical issues have been made as well, such as the collaboration between Sound Ethics (a company promoting ethical AI usage in the music industry) and UC Irvine, focusing on ethical frameworks and the responsible use of AI.


Musical deepfakes

A more nascent development of AI in music is the application of audio deepfakes to cast the lyrics or musical style of a pre-existing song into the voice or style of another artist. This has raised many concerns regarding the legality of the technology, as well as the ethics of employing it, particularly in the context of artistic identity. It has also raised the question of to whom authorship of these works is attributed. As AI cannot hold authorship of its own, current speculation suggests there will be no clear answer until further rulings are made regarding machine learning technologies as a whole. Most recently, preventative measures have begun to be developed by Google and Universal Music Group, built around royalties and credited attribution, to allow producers to replicate the voices and styles of artists legitimately.


"Heart on My Sleeve"

In 2023, an artist known as ghostwriter977 created a musical deepfake called "Heart on My Sleeve" that cloned the voices of Drake and The Weeknd by feeding an assortment of vocal-only tracks from the respective artists into a deep-learning algorithm, creating an artificial model of each artist's voice that could be mapped onto original reference vocals with original lyrics. The track was submitted for Grammy consideration for Best Rap Song and Song of the Year. It went viral, gained traction on TikTok, and received a positive response from audiences, leading to its official release on Apple Music, Spotify, and YouTube in April 2023. Many believed the track was fully composed by AI software, but the producer claimed the songwriting, production, and original (pre-conversion) vocals were still his own. The track was later rescinded from Grammy consideration for failing to follow the guidelines necessary to be considered for an award, and it was eventually removed from all music platforms by Universal Music Group. The song was a watershed moment for AI voice cloning, and models have since been created for hundreds, if not thousands, of popular singers and rappers.


"Where That Came From"

In 2013, country music singer Randy Travis suffered a stroke which left him unable to sing. In the meantime, vocalist James Dupré toured on his behalf, singing his songs for him. Travis and longtime producer Kyle Lehning released a new song in May 2024 titled "Where That Came From", Travis's first new song since his stroke. The recording uses AI technology to re-create Travis's singing voice, composited from over 40 existing vocal recordings alongside those of Dupré.


Technical approaches

Artificial intelligence music encompasses a number of technical approaches used for music composition, analysis, classification, and recommendation. Techniques are drawn from deep learning, machine learning, natural language processing, and signal processing. Current systems can compose entire musical pieces, parse affective content, accompany human players in real time, and learn user- and context-dependent preferences.


Symbolic music composition

Symbolic music generation is the generation of music in discrete symbolic forms such as MIDI, where pitch and timing are precisely defined. Early systems employed rule-based methods and Markov models, but modern systems rely largely on deep learning. Recurrent neural networks (RNNs), and in particular long short-term memory (LSTM) networks, have been employed to model the temporal dependencies of musical sequences and can generate melodies, harmonies, and counterpoint in various musical genres. Transformer models such as Music Transformer and MuseNet became popular for symbolic generation due to their ability to model long-range dependencies and their scalability, and have been employed to generate multi-instrument polyphonic music and stylistic imitations.
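As a concrete illustration of LSTM-based symbolic generation, the following is a minimal PyTorch sketch that trains a next-note predictor on a single toy phrase and then samples a continuation. The vocabulary, network sizes, and training data are illustrative assumptions; real systems train on large corpora with richer event vocabularies (note-on/off, timing, velocity).

```python
import torch
import torch.nn as nn

class MelodyLSTM(nn.Module):
    def __init__(self, vocab=128, embed=64, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab, embed)   # one token per MIDI pitch
        self.lstm = nn.LSTM(embed, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vocab)      # logits over the next pitch

    def forward(self, x, state=None):
        h, state = self.lstm(self.embed(x), state)
        return self.head(h), state

model = MelodyLSTM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
seq = torch.tensor([[60, 62, 64, 65, 67, 69, 71, 72]])  # toy C-major phrase

for _ in range(100):  # teacher-forced next-note prediction
    logits, _ = model(seq[:, :-1])
    loss = nn.functional.cross_entropy(logits.transpose(1, 2), seq[:, 1:])
    opt.zero_grad(); loss.backward(); opt.step()

# Sample a continuation one note at a time.
notes, state = [60], None
x = torch.tensor([[60]])
for _ in range(16):
    logits, state = model(x, state)
    x = torch.multinomial(torch.softmax(logits[0, -1], dim=-1), 1).unsqueeze(0)
    notes.append(x.item())
print(notes)
```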


Audio-based music generation

This method generates music as raw audio waveforms instead of symbolic notation. DeepMind's WaveNet is an early example that uses autoregressive sampling to generate high-fidelity audio. Generative adversarial networks (GANs) and variational autoencoders (VAEs) are increasingly used for audio texture synthesis and for combining the timbres of different instruments. NSynth (Neural Synthesizer), a Google Magenta project, uses a WaveNet-like autoencoder to learn latent audio representations and thereby generate entirely novel instrumental sounds.
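The sketch below illustrates the autoregressive principle behind WaveNet-style generation with a tiny stack of dilated causal convolutions over quantized samples. The network is untrained and its sizes are illustrative assumptions; the real WaveNet adds gated activations, residual and skip connections, and many more layers.

```python
import torch
import torch.nn as nn

class TinyCausalNet(nn.Module):
    def __init__(self, classes=256, channels=32, layers=6):
        super().__init__()
        self.embed = nn.Embedding(classes, channels)   # 8-bit quantized samples
        self.convs = nn.ModuleList(
            nn.Conv1d(channels, channels, kernel_size=2, dilation=2 ** i)
            for i in range(layers)  # dilations 1,2,4,... widen the receptive field
        )
        self.head = nn.Conv1d(channels, classes, 1)

    def forward(self, x):                        # x: (batch, time) int samples
        h = self.embed(x).transpose(1, 2)        # -> (batch, channels, time)
        for conv in self.convs:
            pad = (conv.dilation[0], 0)          # left-pad so no future leakage
            h = torch.relu(conv(nn.functional.pad(h, pad)))
        return self.head(h)                      # logits: (batch, classes, time)

net = TinyCausalNet()
samples = torch.zeros(1, 1, dtype=torch.long)    # seed with "silence"
for _ in range(100):                             # sample one step at a time
    logits = net(samples)[:, :, -1]
    nxt = torch.multinomial(torch.softmax(logits, dim=-1), 1)
    samples = torch.cat([samples, nxt], dim=1)
print(samples.shape)                             # torch.Size([1, 101])
```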


Music information retrieval (MIR)

Music information retrieval (MIR) is the extraction of musically relevant information from audio recordings for use in applications such as genre classification, instrument recognition, mood recognition, beat detection, and similarity estimation. Convolutional neural networks (CNNs) applied to spectrogram features have proven highly accurate at these tasks. Support vector machines (SVMs) and k-nearest neighbors (k-NN) are also used for classification on features such as Mel-frequency cepstral coefficients (MFCCs).
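A minimal sketch of the MFCC-plus-k-NN approach, using librosa and scikit-learn on synthetic tones in place of real recordings; the "genres" and feature summary here are illustrative assumptions, whereas real MIR systems train on labeled corpora.

```python
import numpy as np
import librosa
from sklearn.neighbors import KNeighborsClassifier

sr = 22050

def mfcc_features(y):
    # Mean MFCC vector summarizes the timbre of the whole clip.
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1)

def tone(freq, harmonics):
    # Synthetic 1-second tone; more harmonics = brighter, richer timbre.
    t = np.linspace(0, 1.0, sr, endpoint=False)
    return sum(np.sin(2 * np.pi * freq * k * t) / k for k in range(1, harmonics + 1))

# Pretend "genres": pure tones (class 0) vs. harmonically rich tones (class 1).
X = [mfcc_features(tone(f, h)) for f, h in [(220, 1), (330, 1), (220, 10), (330, 10)]]
y = [0, 0, 1, 1]

clf = KNeighborsClassifier(n_neighbors=1).fit(X, y)
print(clf.predict([mfcc_features(tone(262, 10))]))  # expected: class 1
```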


Hybrid and interactive systems

Hybrid systems combine symbolic and audio-based methods to draw on their respective strengths: they can compose high-level symbolic structures and synthesize them as natural-sounding audio. Real-time interactive systems allow an AI to respond instantaneously to human input in support of live performance. Reinforcement learning and rule-based agents are often used to enable human-AI co-creation in improvisation contexts.
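As a toy example of a real-time interactive agent, the sketch below learns a Markov model from a phrase the human just played and improvises a reply, loosely in the spirit of call-and-response systems such as Pachet's Continuator (this is not its actual algorithm).

```python
import random
from collections import defaultdict

def learn(phrase):
    # Record which pitch tends to follow which in the human's input.
    transitions = defaultdict(list)
    for a, b in zip(phrase, phrase[1:]):
        transitions[a].append(b)
    return transitions

def reply(transitions, start, length=8):
    # Walk the learned transitions to improvise a response phrase.
    out = [start]
    for _ in range(length - 1):
        nxt = transitions.get(out[-1])
        out.append(random.choice(nxt) if nxt else random.choice(list(transitions)))
    return out

human_phrase = [60, 62, 64, 62, 60, 64, 65, 67]  # MIDI pitches from live input
model = learn(human_phrase)
print(reply(model, start=human_phrase[-1]))
```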


Affective computing and emotion-aware music systems

Affective computing techniques enable AI systems to classify or generate music according to its emotional content. Such models use musical features such as tempo, mode, and timbre to classify or influence listener emotions. Deep learning models have been trained both to classify music by affective content and to generate music intended to have particular emotional effects.
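A minimal sketch of mapping coarse musical features onto the valence-arousal plane commonly used in music emotion research; the features and thresholds here are illustrative assumptions, not a published model.

```python
def classify_emotion(tempo_bpm, mode, brightness):
    """mode: 'major' or 'minor'; brightness: 0-1 proxy for timbral energy."""
    arousal = (tempo_bpm - 60) / 120 + brightness / 2   # fast + bright = high arousal
    valence = (0.5 if mode == "major" else -0.5) + brightness - 0.5
    if arousal > 0.5:
        return "happy/excited" if valence > 0 else "angry/tense"
    return "peaceful/content" if valence > 0 else "sad/depressed"

print(classify_emotion(140, "major", 0.8))  # happy/excited
print(classify_emotion(70, "minor", 0.2))   # sad/depressed
```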


AI-based music recommendation systems

Music recommenders employ AI to suggest tracks to users based on their listening history, tastes, and contextual information. Collaborative filtering, content-based filtering, and hybrid approaches are the most widely applied, with deep learning used for fine-tuning. Graph-based and matrix factorization methods are used within commercial systems such as Spotify and YouTube Music to represent complex user-item relationships.
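The sketch below shows matrix factorization on a toy user-track play matrix, the core idea behind many collaborative-filtering recommenders; the data and hyperparameters are illustrative assumptions, and production systems are far larger and typically weight implicit feedback.

```python
import numpy as np

rng = np.random.default_rng(0)
plays = np.array([[5, 3, 0, 1],      # rows: users, cols: tracks, 0 = unheard
                  [4, 0, 0, 1],
                  [1, 1, 0, 5],
                  [0, 1, 5, 4]], dtype=float)
mask = plays > 0                     # only fit observed entries

k, lr, reg = 2, 0.01, 0.1
U = rng.normal(scale=0.1, size=(4, k))   # latent user factors
V = rng.normal(scale=0.1, size=(4, k))   # latent track factors

for _ in range(5000):                # gradient descent on masked squared error
    err = (plays - U @ V.T) * mask
    U += lr * (err @ V - reg * U)
    V += lr * (err.T @ U - reg * V)

pred = U @ V.T
print(np.round(pred, 1))             # predicted affinity, incl. unheard tracks
```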


AI for automatic mixing and mastering

AI is also used to automate audio engineering tasks such as mixing and mastering. Such systems set levels, equalize, pan, and compress audio to produce well-balanced sound. Software such as LANDR and iZotope Ozone uses machine learning to emulate the decisions of professional audio engineers.
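As a minimal example of one automated-mastering step, the sketch below normalizes a track's RMS loudness to a target level; the target value and the naive clipping are illustrative stand-ins for the learned processing chains of commercial tools.

```python
import numpy as np

def rms_normalize(audio, target_dbfs=-14.0):
    # Scale the signal so its RMS level hits the target (in dB full scale).
    rms = np.sqrt(np.mean(audio ** 2))
    target_rms = 10 ** (target_dbfs / 20)
    gain = target_rms / max(rms, 1e-9)
    return np.clip(audio * gain, -1.0, 1.0)   # naive clip in place of a limiter

sr = 44100
quiet_mix = 0.05 * np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
mastered = rms_normalize(quiet_mix)
print(f"{np.sqrt(np.mean(mastered ** 2)):.3f}")  # ~0.200, i.e. -14 dBFS
```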


Lyrics generation and songwriting aid

Natural language generation is also applied to songwriting assistance and lyrics generation. Transformer language models such as GPT-3 have proven able to generate stylistically coherent lyrics from input prompts, themes, or moods. There are also AI programs that assist with rhyme schemes, syllable counts, and poetic form.
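A minimal sketch of the kind of utility such programs provide, using rough vowel-group heuristics for syllable counting and suffix matching for rhyme; real tools use pronunciation dictionaries such as CMUdict rather than these approximations.

```python
import re

def syllables(word):
    # Count vowel groups as syllables, with a crude silent-e adjustment.
    count = len(re.findall(r"[aeiouy]+", word.lower()))
    if word.lower().endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def line_syllables(line):
    return sum(syllables(w) for w in re.findall(r"[a-zA-Z']+", line))

def rhymes(a, b):
    return a.lower()[-3:] == b.lower()[-3:]   # crude end-rhyme check

verse = ["The morning light is breaking", "My weary heart is aching"]
print([line_syllables(l) for l in verse])      # [7, 7]
print(rhymes("breaking", "aching"))            # True
```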


Multimodal and cross-modal systems

Recent developments include multimodal AI systems that integrate music with other media, e.g., dance, video, and text. These can generate background scores in synchronization with video sequences or generate dance choreography from audio input. Cross-modal retrieval systems allow one to search for music using images, text, or gestures.
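A minimal sketch of cross-modal retrieval under the assumption that a text encoder and a music encoder already map into a shared embedding space; the vectors and file names below are made-up stand-ins for real encoder outputs.

```python
import numpy as np

def cosine(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

track_embeddings = {                      # pretend outputs of a music encoder
    "calm_piano.wav":  np.array([0.9, 0.1, 0.0, 0.2]),
    "club_banger.wav": np.array([0.0, 0.9, 0.8, 0.1]),
    "sad_strings.wav": np.array([0.7, 0.0, 0.1, 0.9]),
}
query = np.array([0.8, 0.05, 0.0, 0.3])   # pretend embedding of "quiet mellow piano"

# Retrieval reduces to nearest-neighbor search by cosine similarity.
best = max(track_embeddings, key=lambda k: cosine(query, track_embeddings[k]))
print(best)  # calm_piano.wav
```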


Cultural impact

The advent of AI music has prompted heated cultural debate, especially over its impact on creativity, ethics, and audiences. While the democratization of music production has drawn praise, fears have been raised about its effects on producers, listeners, and society in general.


Reactions and controversies

The most contentious application of AI music creation has been its misuse to produce offensive work. Music AI platforms have been used in several instances to produce songs with racist, antisemitic, or violent lyrics, testing moderation and accountability on generative AI platforms. These cases have renewed debate about the responsibility of users and developers for ensuring moral outputs from generative models. In addition, several producers and artists have denounced the use of AI music, citing threats to originality, handmade craftsmanship, and cultural authenticity. Critics argue that AI-created music lacks the emotional intelligence and lived experience on which human work relies. The concern comes at a time when a steadily growing number of AI-made songs is appearing on platforms, which some consider a devaluation of human artistry.


Musicians vs. consumers

While professional musicians have generally been more dismissive of the use of AI in music production, general consumers and listeners have been receptive or neutral to the idea. Surveys have found that in a commercial context the average consumer often does not know, or even care, whether they are hearing music made by human beings or by AI, and a high percentage say it does not affect their enjoyment. The contrast between artist sentiment and consumer sentiment may hold far-reaching consequences for the future economics of the music industry and the value assigned to human creativity.


Public perception

The cultural value placed on AI music is closely related to broader popular perceptions of generative AI. How generative AI-produced work, whether music or writing, is received in human terms has been found to depend on factors such as emotional meaning and authenticity. As long as AI output proves persuasive and engaging, audiences may in some cases be willing to accept music whose author is not a human being, with the potential to reshape conventions regarding creators and creativity.


Future directions

The field of music and artificial intelligence is still evolving. Key directions for future development include improved generation models, changes in how humans and AI collaborate musically, and the development of legal and ethical frameworks to address the technology's impact.


Advancements in generation models

Future research and development is expected to move beyond established techniques such as generative adversarial networks (GANs) and variational autoencoders (VAEs). More recent architectures such as diffusion models and transformer-based networks show promise for generating more complex, nuanced, and stylistically coherent music. These models may lead to higher-quality audio generation and better long-term structure in music compositions.


Human-AI collaboration

Besides the act of generation itself, a significant future direction of interest involves deepening the collaboration between human musicians and AI. Developments are increasingly focused on understanding the way these collaborations can occur, and how they can be facilitated to be ethically sound. This involves studying musicians’ perceptions and experiences with AI tools to inform the design of future systems. Research actively explores these collaborative models in different domains. For instance, studies investigate how AI can be co-designed with professionals such as music therapists to act as supportive partners in complex creative and therapeutic processes, showing a trend towards developing AI not just as an output tool, but as an integrated component designed to augment human skills.


Regulatory changes and ethical considerations

As AI-generated music becomes more capable and widespread, legal and ethical frameworks worldwide are expected to continue adapting. Current policy discussions focus on copyright ownership, the use of AI to mimic artists (deepfakes), and fair compensation for artists. Recent legislative efforts and debates, such as those concerning AI safety and regulation in places like California, show the challenges involved in balancing innovation with potential risks and societal impacts. Tracking these developments is crucial for understanding the future of AI in the music industry (Hight, Jewly (25 April 2024). "AI music isn't going away. Here are 4 big questions about what's next". ''NPR''. Retrieved March 30, 2025. https://www.npr.org/2024/04/25/1246928162/generative-ai-music-law-technology).


See also

* Algorithmic composition
* Automatic content recognition
* Computational models of musical creativity
* Generative artificial intelligence
* Generative music
* List of music software
* Music information retrieval




Further reading


* ''Understanding Music with AI: Perspectives on Music Cognition''. Edited by Mira Balaban, Kemal Ebcioglu, and Otto Laske. AAAI Press.
* ''Music Education: An Artificial Intelligence Approach''. Proceedings of a workshop held as part of AI-ED 93, World Conference on Artificial Intelligence in Education.
* ''Artificial Intelligence - Intelligent Art? Human-Machine Interaction and Creative Practice'' (Digital Society - Digitale Gesellschaft). Edited by Eckart Voigts, Robin Auer, Dietmar Elflein, Sebastian Kunas, Jan Röhnert, and Christoph Seelinger. Bielefeld: transcript, 2024.


External links


* The Music Informatics Research Group
* Interdisciplinary Centre for Research in Music (archived): https://web.archive.org/web/20190330070355/http://www.leeds.ac.uk/icsrim/
* Mixdevil - Is AI Good For Music Producers
* OpenDreamAI full track generation