Artificial Intelligence Visual Art

Artificial intelligence visual art is visual artwork generated (or enhanced) through the use of artificial intelligence (AI) programs. Artists began to create AI art in the mid-to-late 20th century, when the discipline was founded. Throughout its history, AI has raised many philosophical concerns related to the human mind, artificial beings, and what can be considered ''art'' in human–AI collaboration. Since the 20th century, people have used AI to create art, some of which has been exhibited in museums and won awards. During the AI boom of the 2020s, text-to-image models such as Midjourney, DALL-E, Stable Diffusion, and FLUX.1 became widely available to the public, allowing users to quickly generate imagery with little effort. Commentary about AI art in the 2020s has often focused on issues related to copyright, deception, defamation, and its impact on more traditional artists, including technological unemployment.


History


Early history

Automated art dates back at least to the automata of ancient Greek civilization, when inventors such as Daedalus and Hero of Alexandria were described as designing machines capable of writing text, generating sounds, and playing music. Creative automatons have flourished throughout history, such as Maillardet's automaton, created around 1800 and capable of creating multiple drawings and poems. Also in the 19th century, Ada Lovelace wrote that "computing operations" could be used to generate music and poems, an idea now referred to as "The Lovelace Effect," where a computer's behavior is viewed as creative. Lovelace also raised what is now known as "The Lovelace Objection," arguing that a machine has "no pretensions whatever to originate anything." In 1950, with the publication of Alan Turing's paper "Computing Machinery and Intelligence", there was a shift from defining machine intelligence in abstract terms to evaluating whether a machine can mimic human behavior and responses convincingly. Shortly after, the academic discipline of artificial intelligence was founded at a research workshop at Dartmouth College in 1956. Since its founding, researchers in the field have explored philosophical questions about the nature of the human mind and the consequences of creating artificial beings with human-like intelligence; these issues had previously been explored by myth, fiction, and philosophy since antiquity.


Artistic history

Since the founding of AI in the 1950s, artists have used artificial intelligence to create artistic works. These works were sometimes referred to as algorithmic art, computer art, digital art, or new media art. One of the first significant AI art systems is AARON, developed by Harold Cohen beginning in the late 1960s at the University of California at San Diego. AARON uses a symbolic rule-based approach, typical of the GOFAI era of programming, to generate technical images; Cohen developed it with the goal of being able to code the act of drawing. AARON was exhibited in 1972 at the Los Angeles County Museum of Art. From 1973 to 1975, Cohen refined AARON during a residency at the Artificial Intelligence Laboratory at Stanford University. In 2024, the Whitney Museum of American Art exhibited AI art from throughout Cohen's career, including re-created versions of his early robotic drawing machines.

Karl Sims has exhibited art created with
artificial life since the 1980s. He received an M.S. in computer graphics from the MIT Media Lab in 1987 and was artist-in-residence from 1990 to 1996 at the supercomputer manufacturer and artificial intelligence company Thinking Machines. In both 1991 and 1992, Sims won the Golden Nica award at Prix Ars Electronica for his videos using artificial evolution. In 1997, Sims created the interactive artificial evolution installation ''Galápagos'' for the NTT InterCommunication Center in Tokyo. Sims received an Emmy Award in 2019 for outstanding achievement in engineering development.

In 1999,
Scott Draves and a team of several engineers created and released ''Electric Sheep'' as a free software screensaver. ''Electric Sheep'' is a volunteer computing project for animating and evolving fractal flames, which are distributed to networked computers that display them as a screensaver. The screensaver used AI to create an infinite animation by learning from its audience. In 2001, Draves won the Fundación Telefónica Life 4.0 prize for ''Electric Sheep''.

In 2014, Stephanie Dinkins began working on ''Conversations with Bina48''. For the series, Dinkins recorded her conversations with BINA48, a social robot that resembles a middle-aged black woman. In 2019, Dinkins won the Creative Capital award for her creation of an evolving artificial intelligence based on the "interests and culture(s) of people of color." In 2015, Sougwen Chung began ''Mimicry (Drawing Operations Unit: Generation 1)'', an ongoing collaboration between the artist and a robotic arm. In 2019, Chung won the Lumen Prize for her continued performances with a robotic arm that uses AI to attempt to draw in a manner similar to Chung.

In 2018, an auction sale of artificial intelligence art was held at
Christie's in New York, where the AI artwork ''Edmond de Belamy'' sold for US$432,500, which was almost 45 times higher than its estimate of US$7,000–10,000. The artwork was created by Obvious, a Paris-based collective.

In 2024, the Japanese film ''generAIdoscope'' was released. The film was co-directed by Hirotaka Adachi, Takeshi Sone, and Hiroki Yamaguchi. All video, audio, and music in the film were created with artificial intelligence. In 2025, the Japanese anime television series ''Twins Hinahima'' was released. The anime was produced and animated with AI assistance, which was used to cut photographs and convert them into anime illustrations that were later retouched by art staff. Most of the remaining parts, such as characters and logos, were hand-drawn with various software.


Technical history

Deep learning, characterized by a multi-layer structure that attempts to mimic the human brain, came to prominence in the 2010s and caused a significant shift in the world of AI art. During the deep learning era, the main types of designs for generative art have been autoregressive models, diffusion models, GANs, and normalizing flows.

In 2014, Ian Goodfellow and colleagues at Université de Montréal developed the generative adversarial network (GAN), a type of deep neural network capable of learning to mimic the statistical distribution of input data such as images. The GAN uses a "generator" to create new images and a "discriminator" to decide which created images are considered successful. Unlike previous algorithmic art that followed hand-coded rules, generative adversarial networks could learn a specific aesthetic by analyzing a dataset of example images. In 2015, a team at Google released DeepDream, a program that uses a convolutional neural network to find and enhance patterns in images via algorithmic pareidolia. The process creates deliberately over-processed images with a dream-like appearance reminiscent of a psychedelic experience. Later, in 2017, a conditional GAN learned to generate 1000 image classes of ImageNet, a large visual database designed for use in visual object recognition software research. By conditioning the GAN on both random noise and a specific class label, this approach enhanced the quality of image synthesis for class-conditional models.
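As a rough illustration of the conditioning described above, the following Python sketch shows only the data flow of a class-conditional GAN: the generator receives random noise concatenated with a one-hot class label, and the discriminator scores an (image, label) pair. The weights are random and untrained, there is no training loop, and all sizes and names (`NUM_CLASSES`, `NOISE_DIM`, `IMAGE_PIXELS`, `generator`, `discriminator`) are hypothetical choices made for this example rather than details of any published model.

```python
import math
import random

NUM_CLASSES = 3    # hypothetical; ImageNet-scale models use 1000 classes
NOISE_DIM = 4      # hypothetical size of the random noise vector
IMAGE_PIXELS = 16  # hypothetical 4x4 grayscale image, flattened


def one_hot(label, num_classes=NUM_CLASSES):
    """Encode a class index as a one-hot vector."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def init_weights(rows, cols, rng):
    """Random, untrained weight matrix (rows x cols)."""
    return [[rng.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]


def linear(weights, vector):
    """Matrix-vector product."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def generator(noise, label, g_weights):
    """Map (noise, class label) to a flattened image with pixels in [-1, 1]."""
    conditioned = noise + one_hot(label)  # concatenate noise and label
    return [math.tanh(x) for x in linear(g_weights, conditioned)]


def discriminator(image, label, d_weights):
    """Score an (image, label) pair as a probability of being 'real'."""
    conditioned = image + one_hot(label)  # the label is visible here too
    logit = linear(d_weights, conditioned)[0]
    return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)
g_weights = init_weights(IMAGE_PIXELS, NOISE_DIM + NUM_CLASSES, rng)
d_weights = init_weights(1, IMAGE_PIXELS + NUM_CLASSES, rng)

noise = [rng.gauss(0.0, 1.0) for _ in range(NOISE_DIM)]
fake_image = generator(noise, label=2, g_weights=g_weights)
score = discriminator(fake_image, label=2, d_weights=d_weights)
```

In a real system both functions would be deep neural networks trained against each other; the sketch is meant only to show that the class label enters both of them alongside the noise or the image.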
Autoregressive models were used for image generation, such as PixelRNN (2016), which autoregressively generates one pixel after another with a recurrent neural network. Immediately after the Transformer architecture was proposed in ''Attention Is All You Need'' (2017), it was used for autoregressive generation of images, but without text conditioning. The website Artbreeder, launched in 2018, uses the models StyleGAN and BigGAN to allow users to generate and modify images such as faces, landscapes, and paintings.

In the 2020s, text-to-image models, which generate images based on prompts, became widely used, marking yet another shift in the creation of AI-generated artworks. In 2021, building on the generative pre-trained transformer architecture used in GPT-2 and GPT-3, OpenAI released a series of images created with the text-to-image AI model DALL-E 1. It was an autoregressive generative model with essentially the same architecture as GPT-3. Later in 2021, EleutherAI released the open source VQGAN-CLIP, based on OpenAI's CLIP model.
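The pixel-by-pixel generation attributed above to autoregressive models such as PixelRNN can be sketched in a few lines of Python. This is a toy under stated assumptions, not PixelRNN itself: the hand-written function `next_pixel_probability` stands in for a trained recurrent neural network, the image is binary, and every name and parameter is hypothetical.

```python
import random


def next_pixel_probability(previous_pixels):
    """Toy stand-in for a trained network: P(next pixel = 1 | earlier pixels).

    It receives every pixel generated so far but, for simplicity, only
    looks at the most recent one and favours repeating it.
    """
    if not previous_pixels:
        return 0.5
    return 0.8 if previous_pixels[-1] == 1 else 0.2


def sample_image(width, height, rng):
    """Generate a binary image in raster order, one pixel at a time."""
    pixels = []
    for _ in range(width * height):
        # Each new pixel is sampled from a distribution that depends on
        # the pixels already generated.
        p_on = next_pixel_probability(pixels)
        pixels.append(1 if rng.random() < p_on else 0)
    # Reshape the flat sequence into rows.
    return [pixels[r * width:(r + 1) * width] for r in range(height)]


image = sample_image(width=8, height=4, rng=random.Random(0))
for row in image:
    print("".join("#" if p else "." for p in row))
```

A learned model would replace the fixed 0.8 and 0.2 probabilities with predictions computed from the full history of pixels, but the sampling loop would keep the same one-pixel-at-a-time shape.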
Diffusion models, generative models used to create synthetic data based on existing data, were first proposed in 2015, but they only began to outperform GANs in early 2021. The latent diffusion model was published in December 2021 and became the basis for the later Stable Diffusion (August 2022). In 2022, Midjourney was released, followed by Google Brain's Imagen and Parti, which were announced in May 2022, Microsoft's NUWA-Infinity, and the source-available Stable Diffusion, which was released in August 2022. DALL-E 2, a successor to DALL-E, was beta-tested and released (with the further successor DALL-E 3 being released in 2023). Stability AI has a Stable Diffusion web interface called DreamStudio, plugins for Krita, Photoshop, Blender, and GIMP, and the Automatic1111 web-based open source user interface. Stable Diffusion's main pre-trained model is shared on the Hugging Face Hub.
Ideogram was released in August 2023; the model is known for its ability to generate legible text. In 2024, Flux was released. This model can generate realistic images and was integrated into Grok, the chatbot used on X (formerly Twitter), and ''Le Chat'', the chatbot of Mistral AI. Flux was developed by Black Forest Labs, founded by the researchers behind Stable Diffusion. Grok later switched to its own text-to-image model, Aurora, in December of the same year. Several companies have also integrated AI models into their image editing products.
Adobe has released the AI model Firefly and integrated it into Premiere Pro, Photoshop, and Illustrator. Microsoft has also publicly announced AI image-generator features for Microsoft Paint. Examples of text-to-video models of the mid-2020s include
Runway's Gen-2, Google's VideoPoet, and OpenAI's Sora, which was released in December 2024. In 2025, several models were released. GPT Image 1 from OpenAI, launched in March 2025, introduced new text rendering and multimodal capabilities, enabling image generation from diverse inputs like sketches and text. Midjourney v7 debuted in April 2025, providing improved text prompt processing. In May 2025, Flux.1 Kontext by Black Forest Labs emerged as an efficient model for high-fidelity image generation, while Google's Imagen 4 was released with improved photorealism.


Tools and processes


Approaches

There are many approaches used by artists to develop AI visual art. In text-to-image generation, AI generates images from textual descriptions, using models such as diffusion or transformer-based architectures: users input prompts and the AI produces corresponding visuals. In image-to-image generation, AI transforms an input image into a new style or form based on a prompt or style reference, such as turning a sketch into a photorealistic image or applying an artistic style. In image-to-video generation, AI generates short video clips or animations from a single image or a sequence of images, often adding motion or transitions; this can include animating still portraits or creating dynamic scenes. In text-to-video generation, AI creates videos directly from text prompts, producing animations, realistic scenes, or abstract visuals; this extends text-to-image generation to temporal sequences.


Imagery

There are many tools available to the artist when working with diffusion models. Artists can define both positive and negative prompts, and they can choose whether to use VAEs, LoRAs, hypernetworks, IP-Adapters, and embeddings (textual inversions). They can tweak settings such as the guidance scale (which controls how closely the output follows the prompt, trading variety for fidelity), the seed (which fixes the randomness so that a result can be reproduced), and upscalers (which enhance image resolution), among others. Additional influence can be exerted before inference by means of noise manipulation, while traditional post-processing techniques are frequently applied after inference. People can also train their own models.

In addition, procedural "rule-based" generation of images using mathematical patterns, algorithms that simulate brush strokes and other painted effects, and deep learning algorithms such as generative adversarial networks (GANs) and transformers have been developed. Several companies have released apps and websites that let users forgo all of these options and focus solely on the positive prompt. There are also programs that transform photos into art-like images in the style of well-known sets of paintings. The options range from simple consumer-facing mobile apps to Jupyter notebooks and web UIs that require powerful GPUs to run effectively.

Additional functionalities include "textual inversion", which enables the use of user-provided concepts (such as an object or a style) learned from a few images; novel art can then be generated from the associated word(s), that is, the text assigned to the learned, often abstract, concept. They also include model extensions and fine-tuning (such as DreamBooth).
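The roles of the seed, the guidance scale, and the negative prompt can be illustrated with a toy sketch. It assumes the widely used classifier-free guidance rule, in which the final prediction is the negative-prompt (or empty-prompt) prediction plus the guidance scale times the difference between the positive and negative predictions. The function names and the toy numbers below are hypothetical; in a real diffusion pipeline the predictions come from a neural network at every denoising step, and the noise is a tensor rather than a short list.

```python
import random


def initial_noise(seed, size):
    """Draw the starting noise from a generator initialised with `seed`."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


def apply_guidance(pred_negative, pred_positive, guidance_scale):
    """Classifier-free guidance (assumed rule): start from the prediction for
    the negative or empty prompt and move toward the positive-prompt
    prediction, scaled by `guidance_scale`."""
    return [n + guidance_scale * (p - n)
            for n, p in zip(pred_negative, pred_positive)]


# The same seed reproduces the same starting noise; a new seed changes it.
noise_a = initial_noise(seed=42, size=4)
noise_b = initial_noise(seed=42, size=4)
noise_c = initial_noise(seed=43, size=4)

# Toy stand-ins for what a model might predict for each prompt.
pred_neg = [0.0, 1.0, -1.0, 0.5]
pred_pos = [1.0, 1.0, 0.0, 0.0]

# A scale of 1.0 returns the positive-prompt prediction unchanged.
unguided = apply_guidance(pred_neg, pred_pos, 1.0)
# A larger scale exaggerates the difference between the two prompts.
strong = apply_guidance(pred_neg, pred_pos, 7.5)
```

Under this rule, raising the guidance scale pushes the result further from whatever the negative prompt describes, which is why very high values tend to give less varied, more literal images.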


Impact and applications

AI has the potential for a societal transformation, which may include enabling the expansion of noncommercial niche genres (such as cyberpunk derivatives like solarpunk) by amateurs, novel entertainment, fast prototyping, greater accessibility of art-making, and greater artistic output per unit of effort, expense, or time, for example by generating drafts, draft-definitions, and image components (inpainting). Generated images are sometimes used as sketches, low-cost experiments, inspiration, or illustrations of proof-of-concept-stage ideas. Additional functionalities or improvements may also relate to post-generation manual editing (i.e., polishing), such as subsequent tweaking with an image editor.


Prompt engineering and sharing

Prompts for some text-to-image models can also include images, keywords, and configurable parameters, such as artistic style, which is often specified via keyphrases like "in the style of [name of an artist]" in the prompt and/or by selecting a broad aesthetic or art style. There are platforms for sharing, trading, searching, forking/refining, or collaborating on prompts for generating specific imagery from image generators. Prompts are often shared along with images on image-sharing websites such as Reddit and AI art-dedicated websites. A prompt is not the complete input needed for the generation of an image; additional inputs that determine the generated image include the output resolution, random seed, and random sampling parameters.
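Because the prompt is only one of several inputs, a shared prompt reproduces an image only if the other settings travel with it. The following is a minimal sketch of such a bundle; the class name, field names, and default values are hypothetical and do not correspond to the file format of any particular tool.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class GenerationSettings:
    """Hypothetical bundle of the inputs that determine a generated image."""
    prompt: str
    negative_prompt: str = ""
    width: int = 512
    height: int = 512
    seed: int = 0
    sampler: str = "euler"
    steps: int = 30
    guidance_scale: float = 7.5


def to_shareable(settings):
    """Serialise every setting, not only the prompt, to a JSON string."""
    return json.dumps(asdict(settings), sort_keys=True)


def from_shareable(text):
    """Rebuild the settings from a shared JSON string."""
    return GenerationSettings(**json.loads(text))


original = GenerationSettings(prompt="a lighthouse at dusk, oil painting", seed=1234)
# Same prompt, different seed: a different set of inputs, so a different image.
same_prompt_new_seed = GenerationSettings(prompt=original.prompt, seed=9999)
# Sharing the full bundle lets someone else restore the exact inputs.
restored = from_shareable(to_shareable(original))
```

Even with identical settings, results may still differ across model versions or hardware, so a bundle like this is a necessary rather than sufficient condition for reproducing an image.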


Related terminology

Synthetic media, which includes AI art, was described in 2022 as a major technology-driven trend that will affect business in the coming years. Harvard Kennedy School researchers voiced concerns about synthetic media serving as a vector for political misinformation soon after studying the proliferation of AI art on the X platform. ''Synthography'' is a proposed term for the practice of using AI to generate images that resemble photographs.


Impact


Bias

A major concern raised about AI-generated images and art is sampling bias within model training data, which can lead to discriminatory output from AI art models. In 2023, University of Washington researchers found evidence of racial bias within the Stable Diffusion model, with images of a "person" corresponding most frequently with images of males from Europe or North America.

Looking further into the sampling bias found within AI training data, in 2017 researchers at Princeton University used AI software to link over 2 million words, finding that European names were viewed as more "pleasant" than African-American names, and that the words "woman" and "girl" were more likely to be associated with the arts instead of science and math, "which were most likely connected to males." Generative AI models typically work based on user-entered word-based prompts, especially in the case of diffusion models, and this word-related bias may lead to biased results.

Generative AI can also perpetuate harmful stereotypes regarding women. For example, Lensa, an AI app that trended on TikTok in 2023, was known to lighten black skin, make users thinner, and generate hypersexualized images of women. Melissa Heikkilä, a senior reporter at ''MIT Technology Review'', shared the findings of an experiment using Lensa, noting that the generated avatars did not resemble her and often depicted her in a hypersexualized manner. Experts suggest that such outcomes can result from biases in the datasets used to train AI models, which can sometimes contain imbalanced representations, including hypersexual or nude imagery.

In 2024, Google's
chatbot Gemini was criticized for perceived racial bias in its AI image generator, with claims that Gemini deliberately underrepresented white people in its results. Users reported that it generated images of white historical figures like the Founding Fathers, Nazi soldiers, and Vikings as other races, and that it refused to process prompts such as "happy white people" and "ideal nuclear family". Google later apologized for "missing the mark" and took Gemini's image generator offline for updates. This prompted discussions about the ethical implications of representing historical figures through a contemporary lens, leading critics to argue that these outputs could mislead audiences regarding actual historical contexts.

In addition to the well-documented representational issues such as racial and gender bias, some scholars have also pointed out deeper conceptual assumptions that shape how people perceive AI-generated art. For instance, framing AI strictly as a passive tool overlooks how cultural and technological factors influence its outputs. Others suggest viewing AI as part of a collaborative creative process, where both human and machine contribute to the artistic result.


Copyright

Legal scholars, artists, and media corporations have considered the legal and ethical implications of artificial intelligence art since the 20th century. Some artists use AI art to critique and explore the ethics of using gathered data to produce new artwork. In 1985, intellectual property law professor Pamela Samuelson argued that US copyright should allocate algorithmically generated artworks to the user of the computer program. A 2019 ''Florida Law Review'' article presented three perspectives on the issue. In the first, artificial intelligence itself would become the copyright owner; to do this, Section 101 of the US Copyright Act would need to be amended to define "author" as a computer. In the second, following Samuelson's argument, the user, programmer, or artificial intelligence company would be the copyright owner. This would be an expansion of the "work for hire" doctrine, under which ownership of a copyright is transferred to the "employer." In the third, copyright assignments would never take place, and such works would be in the public domain, as copyright assignments require an act of authorship.

In 2022, coinciding with the rising availability of consumer-grade AI image generation services, popular discussion renewed over the legality and ethics of AI-generated art. A particular topic is the inclusion of copyrighted artwork and images in AI training datasets, with artists objecting to commercial AI products using their works without consent, credit, or financial compensation. In September 2022, Reema Selhi, of the
Design and Artists Copyright Society, stated that "there are no safeguards for artists to be able to identify works in databases that are being used and opt out." Some have claimed that images generated with these models can bear resemblance to extant artwork, sometimes including the remains of the original artist's signature. In December 2022, users of the portfolio platform ArtStation staged an online protest against non-consensual use of their artwork within datasets; this raised the profile of opt-out services such as "Have I Been Trained?" and led some online art platforms to promise their own opt-out options.

According to the US Copyright Office, artificial intelligence programs are unable to hold copyright, a position upheld at the federal district court level in August 2023 in a decision that followed the reasoning from the monkey selfie copyright dispute.
OpenAI, the developer of DALL-E, has its own policy on who owns generated art. It assigns the right and title of a generated image to the creator, meaning that the user who entered the prompt owns the generated image, along with the right to sell, reprint, and merchandise it.

In January 2023, three artists (Sarah Andersen, Kelly McKernan, and Karla Ortiz) filed a copyright infringement lawsuit against Stability AI, Midjourney, and DeviantArt, claiming that the companies are legally required to obtain the consent of artists before training neural nets on their work and that they infringed on the rights of millions of artists by doing so on five billion images scraped from the web (James Vincent, "AI art tools Stable Diffusion and Midjourney targeted with copyright lawsuit", The Verge, 16 January 2023). In July 2023, U.S. District Judge William Orrick was inclined to dismiss most of the lawsuit filed by Andersen, McKernan, and Ortiz, but allowed them to file a new complaint. Also in 2023, Stability AI was sued by
Getty Images for using its images in the training data. A tool built by Simon Willison allowed people to search 0.5% of the training data for Stable Diffusion V1.1, i.e., 12 million of the 2.3 billion instances from LAION 2B. Artist Karen Hallion discovered that her copyrighted images were used as training data without her consent.

In March 2024, Tennessee enacted the ELVIS Act, which prohibits the use of AI to mimic a musician's voice without permission. A month later, Adam Schiff introduced the Generative AI Copyright Disclosure Act which, if passed, would require AI companies to submit copyrighted works in their datasets to the Register of Copyrights before releasing new generative AI systems. In November 2024, a group of artists and activists shared early access to OpenAI's unreleased video generation model, Sora, via Hugging Face. The action, accompanied by a statement, criticized the exploitative use of artists' work by major corporations.

On June 11, 2025,
Universal Pictures (owned by Comcast) and The Walt Disney Company filed a copyright infringement lawsuit against Midjourney. The suit described Midjourney as "a bottomless pit of plagiarism."


Deception

As with other types of photo manipulation since the early 19th century, some people in the early 21st century have been concerned that AI could be used to create misleading content that damages a person's reputation, such as deepfakes. Artist Sarah Andersen, who previously had her art copied and edited to depict Neo-Nazi beliefs, stated that the spread of hate speech online can be worsened by the use of image generators. Some also generate images or videos for the purpose of catfishing. AI systems have the ability to create deepfake content, which is often viewed as harmful and offensive. The creation of deepfakes poses a risk to individuals who have not consented to it. This mainly refers to deepfake pornography used as revenge porn, where sexually explicit material is disseminated to humiliate or harm another person. AI-generated
child pornography has been deemed a potential danger to society due to its unlawful nature.

Image: ''Pseudomnesia: The Electrician'' won one of the categories in the Sony World Photography Awards competition.
Image: A 2023 AI-generated image of Pope Francis wearing a puffy winter jacket fooled some viewers into believing it was an actual photograph. It went viral on social media platforms.
Image: Journalist Eliot Higgins' Midjourney-generated image depicts former President Donald Trump getting arrested. The image was posted on Twitter and went viral.
Image: One of the seven AI-generated images that were used for figures in the now-retracted paper ''Cellular functions of spermatogonial stem cells in relation to JAK/STAT signaling pathway''. Figure 1, "Spermatogonial stem cells, isolated, purified and cultured from rat testes".

After winning the "Creative" category of the "Open competition" at the 2023 Sony World Photography Awards, Boris Eldagsen stated that his entry was actually created with artificial intelligence. Photographer Feroz Khan commented to the BBC that Eldagsen had "clearly shown that even experienced photographers and art experts can be fooled". Smaller contests have been affected as well; in 2023, a contest run by author Mark Lawrence as part of the Self-Published Fantasy Blog-Off was cancelled after the winning entry was allegedly exposed to be a collage of images generated with Midjourney.

In May 2023, a Midjourney-generated image of Pope Francis wearing a white puffer coat drew attention on social media sites such as Reddit and Twitter. Additionally, an AI-generated image of an attack on the Pentagon went viral as part of a hoax news story on Twitter.

In the days before the March 2023 indictment of Donald Trump as part of the Stormy Daniels–Donald Trump scandal, several AI-generated images allegedly depicting Trump's arrest went viral online. On March 20, British journalist Eliot Higgins generated various images of Donald Trump being arrested or imprisoned using Midjourney v5 and posted them on Twitter; two images of Trump struggling against arresting officers went viral under the mistaken impression that they were genuine, accruing more than 5 million views in three days. According to Higgins, the images were not meant to mislead, but he was banned from using Midjourney services as a result. As of April 2024, the tweet had garnered more than 6.8 million views.

In February 2024, the paper ''Cellular functions of spermatogonial stem cells in relation to JAK/STAT signaling pathway'' was published with AI-generated images as figures. It was later retracted from ''Frontiers in Cell and Developmental Biology'' because the paper "does not meet the standards".

To mitigate some deceptions, OpenAI developed a tool in 2024 to detect images that were generated by DALL-E 3. In testing, this tool accurately identified DALL-E 3-generated images approximately 98% of the time. The tool is also fairly capable of recognizing images that have been visually modified by users post-generation.


Income and employment stability

As generative AI image software such as Stable Diffusion and DALL-E continues to advance, concerns about the problems these systems pose for creativity and artistry have grown. In 2022, artists working in various media raised concerns about the impact that generative artificial intelligence could have on their ability to earn money, particularly if AI-based images started replacing artists working in the illustration and design industries. In August 2022, digital artist R. J. Palmer stated that "I could easily envision a scenario where using AI, a single artist or art director could take the place of 5–10 entry level artists... I have seen a lot of self-published authors and such say how great it will be that they don’t have to hire an artist." Scholars Jiang et al. state that "Leaders of companies like Open AI and Stability AI have openly stated that they expect generative AI systems to replace creatives imminently." A 2022 case study found that AI-produced images created by technology like DALL-E caused some traditional artists to be concerned about losing work, while others used it to their advantage and viewed it as a tool.

AI-based images have become more commonplace in art markets and search engines because AI-based text-to-image systems are trained on pre-existing artistic images, sometimes without the original artist's consent, allowing the software to mimic specific artists' styles. For example, Polish digital artist Greg Rutkowski has stated that it is more difficult to search for his work online because many of the images in the results are AI-generated specifically to mimic his style. Furthermore, some training databases on which AI systems are based are not accessible to the public.

The ability of AI-based art software to mimic or forge artistic style also raises concerns of malice or greed. Works of AI-generated art, such as ''Théâtre D'opéra Spatial'', a text-to-image AI illustration that won the grand prize in the August 2022 digital art competition at the Colorado State Fair, have begun to overwhelm art contests and other submission forums meant for small artists. The Netflix short film ''The Dog & the Boy'', released in January 2023, received backlash online for its use of artificial intelligence art to create the film's background artwork. In the same vein, Disney released ''Secret Invasion'', a Marvel TV show with an AI-generated intro, on Disney+ in 2023, causing concern and backlash regarding the idea that artists could be made obsolete by machine-learning tools.

AI art has sometimes been seen as a possible replacement for traditional stock images. In 2023, Shutterstock announced a beta test of an AI tool that can regenerate parts of existing Shutterstock images. Getty Images and Nvidia partnered to launch Generative AI by iStock, a model trained on Getty's library and iStock's photo library using Nvidia's Picasso model.


Power usage

Researchers from Hugging Face and Carnegie Mellon University reported in a 2023 paper that generating one thousand 1024×1024 images using Stable Diffusion's XL 1.0 base model requires 11.49 kWh of energy, and compared the resulting carbon dioxide emissions to those of driving an average gas-powered car. Comparing 88 different models, the paper concluded that image-generation models used on average around 2.9 kWh of energy per 1,000 inferences.
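To put the per-thousand figures above on a per-image scale, the arithmetic can be sketched as follows. This is only an illustrative calculation using the two energy figures quoted in the text; the function names and the batch of 50 images are assumptions made for the example and do not come from the paper.

```python
# Energy figures quoted in the text, in kWh per 1,000 generated images.
SDXL_KWH_PER_1000 = 11.49     # Stable Diffusion XL 1.0 base model
AVERAGE_KWH_PER_1000 = 2.9    # reported average for image-generation models

BATCH_SIZE = 1000             # the figures above are per 1,000 inferences
WH_PER_KWH = 1000             # 1 kWh = 1,000 Wh


def wh_per_image(kwh_per_batch: float) -> float:
    """Convert kWh per 1,000 images into watt-hours per single image."""
    return kwh_per_batch * WH_PER_KWH / BATCH_SIZE


def kwh_for_images(n_images: int, kwh_per_batch: float) -> float:
    """Estimate the energy in kWh for a hypothetical number of images."""
    return n_images * kwh_per_batch / BATCH_SIZE


if __name__ == "__main__":
    print(f"SDXL:    {wh_per_image(SDXL_KWH_PER_1000):.2f} Wh per image")
    print(f"Average: {wh_per_image(AVERAGE_KWH_PER_1000):.2f} Wh per image")
    ratio = SDXL_KWH_PER_1000 / AVERAGE_KWH_PER_1000
    print(f"SDXL uses about {ratio:.1f}x the reported average")
    # Hypothetical batch size, chosen only for illustration.
    print(f"50 SDXL images: {kwh_for_images(50, SDXL_KWH_PER_1000):.4f} kWh")
```

On these figures, a single image from the XL 1.0 base model costs roughly 11.5 Wh, about four times the reported average for image-generation models.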


Analysis of existing art using AI

In addition to the creation of original art, research methods that use AI have been developed to quantitatively analyze digital art collections. This has been made possible by the large-scale digitization of artwork in the past few decades. According to Cetinić and She (2022), using artificial intelligence to analyze already-existing art collections can provide new perspectives on the development of artistic styles and the identification of artistic influences. Two computational methods, close reading and distant viewing, are the typical approaches used to analyze digitized art. Close reading focuses on specific visual aspects of one piece. Some tasks performed by machines in close reading methods include computational artist authentication and analysis of brushstrokes or texture properties. In contrast, through distant viewing methods, the similarity across an entire collection for a specific feature can be statistically visualized. Common tasks relating to this method include automatic classification, object detection, multimodal tasks, knowledge discovery in art history, and computational aesthetics. Synthetic images can also be used to train AI algorithms for art authentication and to detect forgeries.

Researchers have also introduced models that predict emotional responses to art. One such model is ArtEmis, a large-scale dataset paired with machine learning models. ArtEmis includes emotional annotations from over 6,500 participants along with textual explanations. By analyzing both visual inputs and the accompanying text descriptions from this dataset, ArtEmis enables the generation of nuanced emotional predictions.
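The distant viewing idea described above, comparing works across a collection on one feature, can be illustrated with a minimal sketch. It assumes each artwork has already been reduced to a numeric feature vector; the vectors, the names `work_a` to `work_d`, and the choice of cosine similarity are all hypothetical and are not taken from the cited study.

```python
import math

# Hypothetical feature vectors (for example, colour statistics) for four
# artworks. Real pipelines would extract these from digitized images.
FEATURES = {
    "work_a": [0.9, 0.1, 0.0],
    "work_b": [0.8, 0.2, 0.1],
    "work_c": [0.1, 0.9, 0.3],
    "work_d": [0.0, 0.8, 0.4],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(features):
    """Pairwise similarity for every pair of works in the collection."""
    names = sorted(features)
    return {
        (a, b): cosine_similarity(features[a], features[b])
        for a in names
        for b in names
    }


def most_similar(name, features):
    """Return the other work whose feature vector is closest to `name`."""
    others = (n for n in features if n != name)
    return max(others, key=lambda n: cosine_similarity(features[name], features[n]))


if __name__ == "__main__":
    for (a, b), score in similarity_matrix(FEATURES).items():
        if a < b:  # print each unordered pair once
            print(f"{a} vs {b}: {score:.3f}")
    print("closest to work_a:", most_similar("work_a", FEATURES))
```

The resulting matrix is the kind of collection-wide summary that could then be plotted, for instance as a heat map, whereas close reading would instead inspect a single work in detail.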


Other forms of AI art

AI has also been used in arts outside of visual arts. Generative AI has been used to create music, as well as in video game production beyond imagery, especially for level design (e.g., for custom maps) and creating new content (e.g., quests or dialogue) or interactive stories in video games. AI has also been used in the literary arts, such as helping with writer's block, inspiration, or rewriting segments. In the culinary arts, some prototype cooking robots can dynamically taste, which can assist chefs in analyzing the content and flavor of dishes during the cooking process.

