Human image synthesis is technology that can be applied to make believable and even photorealistic renditions of human likenesses, moving or still. It has effectively existed since the early 2000s. Many films using computer-generated imagery have featured synthetic images of human-like characters digitally composited onto real or other simulated film material. Towards the end of the 2010s, deep learning artificial intelligence was applied to synthesize images and video that look like humans, without need for human assistance once the training phase has been completed, whereas the old-school 7D-route required massive amounts of human work.
Timeline of human image synthesis
* In 1971 Henri Gouraud made the first CG geometry capture and representation of a human face. The model was his wife, Sylvie Gouraud. The 3D model was a simple wire-frame model.
* The 1972 short film ''A Computer Animated Hand'' by Edwin Catmull and Fred Parke was the first time that computer-generated imagery was used in film to simulate moving human appearance. The film featured a computer-simulated hand and face.
* The 1976 film ''
Futureworld
Kraftwerk
aired in 1986. Created by the artist Rebecca Allen, it features non-realistic looking, but clearly recognizable computer simulations of the band members.
* The 1994 film The Crow was the first film production to make use of digital compositing of a computer-simulated representation of a face onto scenes filmed using a body double. Necessity was the muse, as the actor Brandon Lee, who portrayed the protagonist, was tragically killed in an accident on set.
* In 1999 Paul Debevec et al. of USC captured the reflectance field of a human face with their first version of a light stage. They presented their method at SIGGRAPH 2000.
* In 2003 audience debut of photorealistic human likenesses in the 2003 films ''
The Matrix Reloaded
Agent Smith
The Matrix Revolutions
'' where, at the start of the final showdown, Agent Smith's cheekbone gets punched in by Neo, leaving the digital look-alike unnaturally unhurt. The ''Matrix Revolutions'' bonus DVD documents and depicts the process and the techniques used in some detail, including facial motion capture and limbal motion capture, and
projection
state-of-the-art
want-to-be human likenesses not quite fooling the watcher made by Square Pictures.
* In 2003 a digital likeness of Tobey Maguire was made for the movies ''Spider-Man 2'' and ''Spider-Man 3'' by Sony Pictures Imageworks.
University of St Andrews and Perception Lab, funded by the EPSRC. The website contains a "Face Transformer", which enables users to transform their face into any ethnicity and age, as well as into a painting (in the style of either Sandro Botticelli or Amedeo Modigliani). This is achieved by combining the user's photograph with an average face.
* In 2009 Debevec et al. presented new digital likenesses, made by Image Metrics, this time of actress Emily O'Brien, whose reflectance was captured with the USC light stage 5. In a TED talk video at 00:04:59 you can see ''two clips, one with the real Emily shot with a real camera and one with a digital look-alike of Emily, shot with a simulation of a camera – Which is which is difficult to tell''. Bruce Lawmen was scanned using USC light stage 6 in a still position and also recorded running there on a treadmill. Many digital look-alikes of Bruce are seen running fluently and naturally in the ending sequence of the TED talk video. The motion looks fairly convincing compared to the clunky run in the ''Animatrix: Final Flight of the Osiris'', which was state-of-the-art in 2003 if photorealism was the intention of the animators.
* In 2009 a digital look-alike of a younger Arnold Schwarzenegger was made for the movie ''Terminator Salvation'', though the end result was criticized as unconvincing. Facial geometry was acquired from a 1984 mold of Schwarzenegger.
* In 2010 Walt Disney Pictures released a sci-fi sequel entitled ''Tron: Legacy'' with a digitally rejuvenated look-alike of actor Jeff Bridges playing the antagonist CLU.
* At SIGGRAPH 2013 Activision and USC presented a real-time "Digital Ira", a digital face look-alike of Ari Shapiro, an ICT USC research scientist, utilizing the USC light stage X by Ghosh et al. for both reflectance field and motion capture. The end result, both precomputed and rendered in real time on the latest game GPU, looks fairly realistic.
* In 2014 The Presidential Portrait by the USC Institute for Creative Technologies, in conjunction with the Smithsonian Institution, was made using the latest USC mobile light stage, wherein President Barack Obama had his geometry, textures and reflectance captured.
* In 2014 Ian Goodfellow et al. presented the principles of a generative adversarial network. GANs made the headlines in early 2018 with the deepfakes controversies.
* For the 2015 film ''Furious 7'' a digital look-alike of actor Paul Walker, who died in an accident during the filming, was made by Weta Digital to enable the completion of the film.
* In 2016 techniques that allow near real-time counterfeiting of facial expressions in existing 2D video were believably demonstrated.
* In 2016 a digital look-alike of Peter Cushing was made for the film ''Rogue One'', in which it appears to be the same age as the actor was during the filming of the original 1977 ''Star Wars'' film.
* At SIGGRAPH 2017 an audio-driven digital look-alike of the upper torso of Barack Obama was presented by researchers from the University of Washington. Once the training phase, which acquires lip sync and wider facial information from training material consisting of 2D videos with audio, had been completed, the animation was driven only by a voice track as source data.
* Late 2017 and early 2018 saw the surfacing of the deepfakes controversy, in which porn videos were doctored using deep machine learning so that the face of the actress was replaced by the software's opinion of what another person's face would look like in the same pose and lighting.
* At the 2018 Game Developers Conference, Epic Games and Tencent Games demonstrated "Siren", a digital look-alike of the actress Bingjie Jiang. It was made possible by the following technologies: CubicMotion's computer vision system, 3Lateral's facial rigging system and Vicon's motion capture system. The demonstration ran in near real time at 60 frames per second in Unreal Engine 4.
* In 2018 at the World Internet Conference in Wuzhen, the Xinhua News Agency presented two digital look-alikes made to resemble its real news anchors Qiu Hao (Chinese language) and Zhang Zhao (English language). The digital look-alikes were made in conjunction with Sogou. Neither the speech synthesis used nor the gesturing of the digital look-alike anchors was good enough to deceive the watcher into mistaking them for real humans imaged with a TV camera.
* In September 2018 Google added "involuntary synthetic pornographic imagery" to its ban list, allowing anyone to request the search engine block results that falsely depict them as "nude or in a sexually explicit situation."
* In February 2019 Nvidia open-sourced StyleGAN, a generative adversarial network. Right after this, Phillip Wang made the website ThisPersonDoesNotExist.com with StyleGAN to demonstrate that unlimited amounts of often photorealistic-looking facial portraits of no one can be made automatically using a GAN. Nvidia's StyleGAN was presented in a not yet peer-reviewed paper in late 2018.
* At the June 2019 CVPR, MIT CSAIL presented a system titled ''"Speech2Face: Learning the Face Behind a Voice"'' that synthesizes likely faces based on just a recording of a voice. It was trained with massive amounts of video of people speaking.
* Since 1 July 2019
Virginia
Code of Virginia
. The law text states: "''Any person who, with the intent to coerce, harass, or intimidate, maliciously disseminates or sells any videographic or still image created by any means whatsoever that depicts another person who is totally nude, or in a state of undress so as to expose the genitals, pubic area, buttocks, or female breast, where such person knows or has reason to know that he is not licensed or authorized to disseminate or sell such videographic or still image is guilty of a Class 1 misdemeanor.''" The identical bills were House Bill 2678, presented by Delegate Marcus Simon to the Virginia House of Delegates on 14 January 2019, and, three days later, an identical Senate Bill 1736 introduced to the Senate of Virginia
Texas
senate bill SB 751 amendments to the election code came into effect, giving candidates in elections a 30-day protection period before the elections during which making and distributing digital look-alikes or synthetic fakes of the candidates is an offense. The law text defines the subject of the law as "''a video, created with the intent to deceive, that appears to depict a real person performing an action that did not occur in reality''".
* In September 2019 Yle, the Finnish public broadcasting company, aired a piece of experimental journalism in its main news broadcast: a deepfake of the sitting President, Sauli Niinistö, made to highlight the advancing disinformation technology and the problems that arise from it.
* On 1 January 2020 the California state law AB-602 came into effect, banning the manufacturing and distribution of synthetic pornography without the consent of the people depicted. AB-602 provides victims of synthetic pornography with injunctive relief and threatens those who make or distribute synthetic pornography without consent with statutory and punitive damages. The bill was authored by California State Assembly member Marc Berman and signed into law by California Governor Gavin Newsom on 3 October 2019.
* On 1 January 2020 a Chinese law requiring that synthetically faked footage bear a clear notice about its fakeness came into effect. Failure to comply could be considered a crime, the Cyberspace Administration of China stated on its website. China announced this new law in November 2019. The Chinese government seems to be reserving the right to prosecute both users and online video platforms failing to abide by the rules.
Key breakthrough to photorealism: reflectance capture
In 1999 Paul Debevec et al. of USC performed the first known reflectance capture of the human face using their extremely simple light stage. They presented their method and results at SIGGRAPH 2000.
The scientific breakthrough required isolating the subsurface light component (the simulation models glow slightly from within). This component can be found using the knowledge that light reflected from the oil-to-air layer retains its polarization, whereas subsurface light loses its polarization. Equipped only with a movable light source, a movable video camera, two polarizers and a computer program doing extremely simple math, the researchers acquired the last piece required to reach photorealism.
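The polarization argument above can be sketched as a per-pixel calculation. This is a minimal illustration under idealized assumptions (perfect polarizers, fully depolarized subsurface light, linear sensor values), not necessarily the computation used in the original light stage work; the function name and the sample values are hypothetical.

```python
def separate_reflectance(parallel, cross):
    """Split two polarized captures into surface and subsurface components.

    parallel: image taken with the camera polarizer aligned with the light's polarizer
    cross:    image taken with the camera polarizer rotated by 90 degrees
    Both are 2D lists of linear intensities with the same shape.
    """
    specular = []
    subsurface = []
    for row_p, row_c in zip(parallel, cross):
        # Surface reflection keeps its polarization, so the crossed polarizer
        # blocks it. Subsurface light is depolarized, so (ideally) half of it
        # passes in either orientation.
        specular.append([max(p - c, 0.0) for p, c in zip(row_p, row_c)])
        subsurface.append([2.0 * c for c in row_c])
    return specular, subsurface


# Hypothetical 2x2 captures of the same patch of skin.
parallel = [[0.9, 0.4], [0.3, 0.2]]
cross = [[0.2, 0.2], [0.1, 0.1]]
specular, subsurface = separate_reflectance(parallel, cross)
```

With real captures the two images would need to be registered and radiometrically calibrated first; the sketch skips both steps.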
For a believable result, both the light reflected from the skin (BRDF) and the light scattered within the skin (a special case of BTDF), which together make up the BSDF, must be captured and simulated.
Capturing
* The 3D geometry of the target is captured onto a model by 3D reconstruction, for example by 3D scanning with an RGB scanner such as those made by Cyberware, or from photos. Digital sculpting can be used to make up models of the body parts for which data cannot be acquired, e.g. parts of the body covered by clothing.
* For believable results, the reflectance field must also be captured, or an approximation must be picked from existing libraries, to form a 7D reflectance model of the target.
Synthesis
The whole process of making digital look-alikes, i.e. characters so lifelike and realistic that they can be passed off as pictures of humans, is a very complex task, as it requires photorealistically modeling, animating, cross-mapping, and rendering the soft body dynamics of the human appearance.
Synthesis is carried out with an actor and suitable algorithms, using powerful computers. The actor's part in the synthesis is to mimic human expressions in still picture synthesis and also human movement in motion picture synthesis. Algorithms are needed to simulate the laws of physics and physiology and to map the models and their appearance, movements and interaction accordingly.
Often both physics/physiology-based modeling (e.g. skeletal animation) and image-based modeling and rendering are employed in the synthesis part. Hybrid models employing both approaches have shown the best results in realism and ease of use. Morph target animation reduces the workload by giving higher-level control: different facial expressions are defined as deformations of the model, which allows facial expressions to be tuned intuitively. Morph target animation can then morph the model between different defined facial expressions or body poses without much need for human intervention.
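Morph target animation as described above can be sketched as a weighted sum of per-vertex offsets from a neutral mesh. The following is a minimal illustration, not a production rig; the function name, the target names and the tiny two-vertex mesh are hypothetical.

```python
def blend_morph_targets(base, targets, weights):
    """Deform the base mesh by a weighted sum of morph-target offsets.

    base:    list of (x, y, z) vertices of the neutral pose
    targets: dict mapping a target name to a vertex list with the same topology
    weights: dict mapping a target name to its blend weight (0.0 = off, 1.0 = full)
    """
    result = []
    for i, vertex in enumerate(base):
        blended = list(vertex)
        for name, weight in weights.items():
            target_vertex = targets[name][i]
            for axis in range(3):
                # Each target contributes its offset from the neutral pose.
                blended[axis] += weight * (target_vertex[axis] - vertex[axis])
        result.append(tuple(blended))
    return result


# Hypothetical two-vertex "face" with two expression targets.
neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
targets = {
    "smile": [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)],
    "jaw_open": [(0.0, 0.0, 0.0), (1.0, -2.0, 0.0)],
}
half_smile = blend_morph_targets(neutral, targets, {"smile": 0.5})
mixed = blend_morph_targets(neutral, targets, {"smile": 1.0, "jaw_open": 0.5})
```

Animating an expression then amounts to varying the weights over time, which is what gives the animator the higher-level control mentioned in the text.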
Displacement mapping plays an important part in achieving a realistic result with fine skin detail such as pores and wrinkles as small as 100 μm.
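The idea behind displacement mapping can be sketched as moving each surface point along its normal by a height-map sample. This is a simplified per-vertex illustration (real renderers usually tessellate the surface and sample a texture); the function name and the sample data are hypothetical.

```python
def displace(vertices, normals, heights, scale):
    """Move each vertex along its unit normal by its height sample times scale."""
    displaced = []
    for (px, py, pz), (nx, ny, nz), h in zip(vertices, normals, heights):
        d = h * scale
        displaced.append((px + d * nx, py + d * ny, pz + d * nz))
    return displaced


# Hypothetical flat patch of skin facing +z, coordinates in millimetres.
vertices = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0)]
normals = [(0.0, 0.0, 1.0)] * 3
heights = [0.0, -1.0, 0.0]  # the middle sample marks a pore
pore_depth_mm = 0.1         # 100 μm, the detail size mentioned in the text
skin = displace(vertices, normals, heights, pore_depth_mm)
```

Unlike bump or normal mapping, the geometry itself changes here, so the detail also shows up in silhouettes and shadows.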
Machine learning approach
In the late 2010s, machine learning, and more precisely generative adversarial networks (GANs), were used by NVIDIA to produce random yet photorealistic human-like portraits. The system, named StyleGAN, was trained on a database of 70,000 images from the image hosting website Flickr. The source code was made public on GitHub in 2019. Outputs of the generator network from random input were made publicly available on a number of websites.
Similarly, since 2018, deepfake technology has allowed GANs to swap faces between actors; combined with the ability to fake voices, GANs can thus generate fake videos that seem convincing.
Applications
The main applications fall within the domains of stock photography, synthetic data sets, virtual cinematography, computer and video games, and covert disinformation attacks. Some facial-recognition AI systems use images generated by other AI systems as synthetic data for training.
Furthermore, some research suggests that it can have therapeutic effects, as "psychologists and counselors have also begun using avatars to deliver therapy to clients who have phobias [...] social anxiety." The strong memory imprint and brain activation effects caused by watching a digital look-alike avatar of yourself are dubbed the Doppelgänger effect. The doppelgänger effect can heal when a covert disinformation attack is exposed as such to the targets of the attack.
Related issues
Speech synthesis has been verging on being completely indistinguishable from a recording of a real human's voice since the 2016 introduction of the voice editing and generation software Adobe Voco, a prototype slated to be a part of the Adobe Creative Suite, and DeepMind WaveNet, a prototype from Google. The ability to steal and manipulate other people's voices raises obvious ethical concerns.
At the 2018 Conference on Neural Information Processing Systems (NeurIPS), researchers from Google presented the work 'Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis', which transfers learning from speaker verification to achieve text-to-speech synthesis that can be made to sound almost like anybody from a speech sample of only 5 seconds.
Sourcing images for AI training raises a question of privacy, as the people whose images are used for training did not consent. Digital sound-alike technology has found its way into the hands of criminals: in 2019 Symantec researchers knew of 3 cases where the technology had been used for crime.
This, coupled with the fact that (as of 2016) techniques which allow near real-time counterfeiting of facial expressions in existing 2D video have been believably demonstrated, increases the stress on the disinformation situation.
Propaganda techniques
* 3D data acquisition and object reconstruction
* 3D reconstruction from multiple images
* 3D pose estimation in general and articulated body pose estimation especially to do with capturing human likeness.
* 4D reconstruction
* Finger tracking
* Gesture recognition