
A graphics processing unit (GPU) is a specialized electronic circuit designed to manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display device. GPUs are used in embedded systems, mobile phones, personal computers, workstations, and game consoles.
Modern GPUs are efficient at computer graphics and image processing. Their parallel structure makes them more efficient than general-purpose central processing units (CPUs) for algorithms that process large blocks of data in parallel. In a personal computer, a GPU can be present on a video card or embedded on the motherboard. In some CPUs, they are embedded on the CPU die.
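As a rough illustration of what processing "large blocks of data in parallel" means, the sketch below brightens an image by applying the same small function independently to every pixel. The function names and sample values are hypothetical. A thread pool only stands in for the many lanes of a GPU, and this is ordinary Python rather than GPU code.

```python
from concurrent.futures import ThreadPoolExecutor

def brighten_pixel(value, amount=40):
    """Per-pixel work: depends only on its own input, never on neighbours."""
    return min(255, value + amount)

def brighten_sequential(pixels):
    # CPU-style: one pixel after another.
    return [brighten_pixel(p) for p in pixels]

def brighten_parallel(pixels, workers=4):
    # Data-parallel style: the same function is mapped over all pixels.
    # Because no pixel depends on another, the work can be split freely.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(brighten_pixel, pixels))

pixels = [0, 100, 200, 250]          # hypothetical 8-bit grayscale samples
print(brighten_sequential(pixels))   # [40, 140, 240, 255]
print(brighten_parallel(pixels))     # same result, order preserved
```

The point of the sketch is the independence of each element: that property is what lets hardware with many simple execution units work on all pixels at once.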
In the 1970s, the term "GPU" originally stood for ''graphics processor unit'' and described a programmable processing unit working independently of the CPU and responsible for graphics manipulation and output. Later, in 1994, Sony used the term (now standing for ''graphics processing unit'') in reference to the PlayStation console's Toshiba-designed Sony GPU.
The term was popularized by Nvidia in 1999, which marketed the GeForce 256 as "the world's first GPU". It was presented as a "single-chip processor with integrated transform, lighting, triangle setup/clipping, and rendering engines". Rival ATI Technologies coined the term "visual processing unit" or VPU with the release of the Radeon 9700 in 2002.
History
1970s
Arcade system boards have been using specialized graphics circuits since the 1970s. In early video game hardware, the RAM for frame buffers was expensive, so video chips composited data together as the display was being scanned out on the monitor.
A specialized barrel shifter circuit was used to help the CPU animate the framebuffer graphics for various 1970s arcade games from Midway and Taito, such as ''Gun Fight'' (1975), ''Sea Wolf'' (1976) and ''Space Invaders'' (1978). The Namco Galaxian arcade system in 1979 used specialized graphics hardware supporting RGB color, multi-colored sprites and tilemap backgrounds. The Galaxian hardware was widely used during the golden age of arcade video games, by game companies such as Namco, Centuri, Gremlin, Irem, Konami, Midway, Nichibutsu, Sega and Taito.
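To see why a barrel shifter helps with framebuffer animation, consider a 1-bit-per-pixel framebuffer in which eight pixels share a byte. Drawing a sprite at a horizontal position that is not a multiple of eight means shifting every sprite byte and splitting it across two framebuffer bytes. The sketch below models that shift in software. The function name and the byte layout (leftmost pixel in the most significant bit) are assumptions for illustration, not a description of any specific arcade board.

```python
def shift_sprite_row(sprite_byte, offset):
    """Shift an 8-pixel sprite row right by `offset` pixels (0-7).

    Returns the two framebuffer bytes the row now straddles.
    Assumes the leftmost pixel is the most significant bit.
    """
    if not 0 <= offset <= 7:
        raise ValueError("offset must be between 0 and 7")
    # Place the row in the high byte of a 16-bit word, then shift once.
    word = (sprite_byte << 8) >> offset
    return (word >> 8) & 0xFF, word & 0xFF

left, right = shift_sprite_row(0b11110000, 3)
print(f"{left:08b} {right:08b}")  # 00011110 00000000
```

A hardware barrel shifter produces the shifted result for any offset in a single step, which spares a CPU that shifts one bit per instruction from looping over every sprite byte.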

In the home market, the Atari 2600 in 1977 used a video shifter called the Television Interface Adaptor. The Atari 8-bit computers (1979) had ANTIC, a video processor which interpreted instructions describing a "display list": the way the scan lines map to specific bitmapped or character modes and where the memory is stored (so there did not need to be a contiguous frame buffer). 6502 machine code subroutines could be triggered on scan lines by setting a bit on a display list instruction. ANTIC also supported smooth vertical and horizontal scrolling independent of the CPU.
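The display-list idea can be sketched as a small interpreter. Each entry below names a mode for a band of scan lines, the address where that band's data lives, and an optional interrupt flag. The field names, mode names and addresses are hypothetical and do not reproduce ANTIC's actual instruction encoding.

```python
from dataclasses import dataclass

@dataclass
class DisplayListEntry:
    mode: str                # e.g. "bitmap" or "text" (hypothetical names)
    scan_lines: int          # how many scan lines this entry covers
    address: int             # where this band's data starts in memory
    interrupt: bool = False  # request a CPU routine when the band starts

def walk_display_list(entries, on_interrupt):
    """Yield (scan_line, mode, address) for every scan line on screen."""
    line = 0
    for entry in entries:
        if entry.interrupt:
            on_interrupt(line)   # stands in for a triggered CPU subroutine
        for _ in range(entry.scan_lines):
            yield line, entry.mode, entry.address
            line += 1

display_list = [
    DisplayListEntry("text", 8, 0x4000),
    DisplayListEntry("bitmap", 4, 0x6000, interrupt=True),
    DisplayListEntry("text", 8, 0x4100),
]

triggered = []
frame = list(walk_display_list(display_list, triggered.append))
print(len(frame), triggered)  # 20 [8]
```

Because each entry carries its own address, the bands can live anywhere in memory, which is the property the text describes as not needing a contiguous frame buffer.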
1980s
The NEC µPD7220 was the first implementation of a PC graphics display processor as a single large-scale integration (LSI) integrated circuit chip, enabling the design of low-cost, high-performance video graphics cards such as those from Number Nine Visual Technology. It became the best-known GPU of the period up to the mid-1980s. It was the first fully integrated VLSI (very large-scale integration) metal-oxide-semiconductor (NMOS) graphics display processor for PCs, supported up to 1024×1024 resolution, and laid the foundations for the emerging PC graphics market. It was used in a number of graphics cards and was licensed for clones such as the Intel 82720, the first of Intel's graphics processing units. The Williams Electronics arcade games ''Robotron 2084'', ''Joust'', ''Sinistar'', and ''Bubbles'', all released in 1982, contain custom blitter chips for operating on 16-color bitmaps.
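A blitter's core job is copying a rectangular block of pixels from one bitmap to another, optionally combining source and destination with a logical operation. A minimal software model, with bitmaps as lists of rows and all names hypothetical, might look like the following. Real blitter chips differ in their registers and supported operations.

```python
def blit(src, dst, src_x, src_y, dst_x, dst_y, width, height,
         op=lambda s, d: s):
    """Copy a width x height rectangle from src into dst, in place.

    `op` combines a source pixel with the destination pixel; the default
    is a plain copy. Bitmaps are lists of rows of integer colour indices.
    """
    for row in range(height):
        for col in range(width):
            s = src[src_y + row][src_x + col]
            d = dst[dst_y + row][dst_x + col]
            dst[dst_y + row][dst_x + col] = op(s, d)

sprite = [[1, 2],
          [3, 4]]
screen = [[0] * 4 for _ in range(3)]

blit(sprite, screen, 0, 0, 1, 1, 2, 2)
print(screen)  # [[0, 0, 0, 0], [0, 1, 2, 0], [0, 3, 4, 0]]
```

Passing `op=lambda s, d: s ^ d` turns this into an XOR blit, where blitting the same block twice restores the original destination.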
In 1984, Hitachi released the ARTC HD63484, the first major CMOS graphics processor for PCs. The ARTC was capable of displaying up to 4K resolution when in monochrome mode, and it was used in a number of PC graphics cards and terminals during the late 1980s. In 1985, the Commodore Amiga featured a custom graphics chip, with a blitter unit accelerating bitmap manipulation, line draw, and area fill functions. Also included was a coprocessor with its own simple instruction set, capable of manipulating graphics hardware registers in sync with the video beam (e.g. for per-scanline palette switches, sprite multiplexing, and hardware windowing), or driving the blitter. In 1986, Texas Instruments released the TMS34010, the first fully programmable graphics processor. It could run general-purpose code, but it had a graphics-oriented instruction set. During 1990–1992, this chip became the basis of the Texas Instruments Graphics Architecture ("TIGA") Windows accelerator cards.
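The Amiga's beam-synchronized coprocessor can be pictured as a tiny program of "wait for a scan line, then write a register" steps. The sketch below simulates per-scanline palette switches. The two instruction names, the register name `COLOR0` and the colour values are illustrative assumptions rather than the Amiga's actual encoding.

```python
def run_beam_program(program, total_lines, registers):
    """Simulate a beam-synchronised coprocessor over one frame.

    `program` is a list of ("WAIT", line) and ("MOVE", register, value)
    tuples. Returns the value of COLOR0 seen on each scan line.
    """
    pc = 0
    background_per_line = []
    for line in range(total_lines):
        # Execute instructions until the program must wait for a later line.
        while pc < len(program):
            instr = program[pc]
            if instr[0] == "WAIT":
                if line < instr[1]:
                    break            # beam has not reached that line yet
            elif instr[0] == "MOVE":
                registers[instr[1]] = instr[2]
            pc += 1
        background_per_line.append(registers["COLOR0"])
    return background_per_line

program = [
    ("MOVE", "COLOR0", 0x00F),   # blue from the top of the frame
    ("WAIT", 2),
    ("MOVE", "COLOR0", 0x0F0),   # green from scan line 2
    ("WAIT", 4),
    ("MOVE", "COLOR0", 0xF00),   # red from scan line 4
]
print(run_beam_program(program, 6, {"COLOR0": 0}))
# [15, 15, 240, 240, 3840, 3840]
```

The main CPU only has to build the program once; the colour changes then happen at the right scan lines without further CPU involvement.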

In 1987, the IBM 8514 graphics system was released as one of the first video cards for IBM PC compatibles to implement fixed-function 2D primitives in electronic hardware. Sharp's X68000, released in 1987, used a custom graphics chipset with a 65,536 color palette and hardware support for sprites, scrolling, and multiple playfields, eventually serving as a development machine for Capcom's CP System arcade board. Fujitsu later competed with the FM Towns computer, released in 1989 with support for a full 16,777,216 color palette. In 1988, the first dedicated polygonal 3D graphics boards were introduced in arcades with the Namco System 21 and Taito Air System.
IBM's proprietary Video Graphics Array (VGA) display standard was introduced in 1987, with a maximum resolution of 640×480 pixels. In November 1988, NEC Home Electronics announced its creation of the Video Electronics Standards Association (VESA) to develop and promote a Super VGA (SVGA) computer display standard as a successor to IBM's proprietary VGA display standard. Super VGA enabled graphics display resolutions up to 800×600 pixels, about 56% more pixels than VGA's 640×480.
1990s

In 1991, S3 Graphics introduced the ''S3 86C911'', which its designers named after the Porsche 911 as an indication of the performance increase it promised. The 86C911 spawned a host of imitators: by 1995, all major PC graphics chip makers had added 2D acceleration support to their chips. By this time, fixed-function ''Windows accelerators'' had surpassed expensive general-purpose graphics coprocessors in Windows performance, and these coprocessors faded away from the PC market.
Throughout the 1990s, 2D GUI acceleration continued to evolve. As manufacturing capabilities improved, so did the level of integration of graphics chips. Additional application programming interfaces (APIs) arrived for a variety of tasks, such as Microsoft's WinG graphics library for Windows 3.x, and its later DirectDraw interface for hardware acceleration of 2D games within Windows 95 and later.
In the early- and mid-1990s, real-time 3D graphics were becoming increasingly common in arcade, computer, and console games, which led to increasing public demand for hardware-accelerated 3D graphics. Early examples of mass-market 3D graphics hardware can be found in arcade system boards such as the Sega Model 1, Namco System 22, and Sega Model 2, and the fifth-generation video game consoles such as the Saturn, PlayStation and Nintendo 64. Arcade systems such as the Sega Model 2 and SGI Onyx-based Namco Magic Edge Hornet Simulator in 1993 were capable of hardware T&L (transform, clipping, and lighting) years before the feature appeared in consumer graphics cards. Some systems used DSPs to accelerate transformations. Fujitsu, which worked on the Sega Model 2 arcade system, began working on integrating T&L into a single LSI solution for use in home computers in 1995; the Fujitsu Pinolite, the first 3D geometry processor for personal computers, was released in 1997. The first hardware T&L GPU on home video game consoles was the Nintendo 64's Reality Coprocessor, released in 1996. In 1997, Mitsubishi released the 3Dpro/2MP, a fully featured GPU capable of transformation and lighting, for workstations and Windows NT desktops; ATI utilized it for their FireGL 4000 graphics card, released in 1997.
The term "GPU" was coined by
Sony
in reference to the 32-bit
Sony GPU (designed by
Toshiba
) in the
PlayStation
video game console, released in 1994.
In the PC world, notable failed early attempts at low-cost 3D graphics chips were the
S3 ''
ViRGE
'',
ATI Rage, and
Matrox
''Mystique''. These chips were essentially previous-generation 2D accelerators with 3D features bolted on. Many were even
pin-compatible
with the earlier-generation chips for ease of implementation and minimal cost. Initially, performance 3D graphics were possible only with discrete boards dedicated to accelerating 3D functions (and lacking 2D GUI acceleration entirely) such as the
PowerVR
and the
3dfx
''Voodoo''. However, as manufacturing technology continued to progress, video, 2D GUI acceleration and 3D functionality were all integrated into one chip.
Rendition's ''Vérité'' chipsets were among the first to do this well enough to be worthy of note. In 1997, Rendition went a step further by collaborating with
Hercules
and Fujitsu on a "Thriller Conspiracy" project which combined a Fujitsu FXG-1 Pinolite geometry processor with a Vérité V2200 core to create a graphics card with a full T&L engine years before Nvidia's
GeForce 256
. This card, designed to reduce the load placed upon the system's CPU, never made it to market.
OpenGL
appeared in the early '90s as a professional graphics API, but originally suffered from performance issues which allowed the
Glide API to step in and become a dominant force on the PC in the late '90s.
However, these issues were quickly overcome and the Glide API fell by the wayside. Software implementations of OpenGL were common during this time, although the influence of OpenGL eventually led to widespread hardware support. Over time, a parity emerged between features offered in hardware and those offered in OpenGL.
DirectX
became popular among
Windows
game developers during the late '90s. Unlike OpenGL, DirectX was held by Microsoft to strict one-to-one support of hardware. This approach initially made DirectX less popular as a standalone graphics API, since many GPUs provided their own specific features, which existing OpenGL applications were already able to benefit from, leaving DirectX often one generation behind. (See:
Comparison of OpenGL and Direct3D
.)
Over time, Microsoft began to work more closely with hardware developers and started to target the releases of DirectX to coincide with those of the supporting graphics hardware.
Direct3D
5.0 was the first version of the burgeoning API to gain widespread adoption in the gaming market, and it competed directly with many more-hardware-specific, often proprietary graphics libraries, while OpenGL maintained a strong following. Direct3D 7.0 introduced support for hardware-accelerated
transform and lighting
(T&L) for Direct3D, while OpenGL had this capability already exposed from its inception. 3D accelerator cards moved beyond being just simple
rasterizers to add another significant hardware stage to the 3D rendering pipeline. The
Nvidia
''
GeForce 256
'' (also known as NV10) was the first consumer-level card released on the market with hardware-accelerated T&L, while professional 3D cards already had this capability. Hardware transform and lighting, both already existing features of OpenGL, came to consumer-level hardware in the '90s and set the precedent for later
pixel shader
and
vertex shader
units which were far more flexible and programmable.
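As a rough illustration of what a fixed-function transform and lighting stage computes per vertex, the following Python sketch applies a 4x4 matrix to a vertex, performs the perspective divide, and evaluates a simple diffuse term. The matrix values and the function names (`transform`, `project`, `lambert`) are assumptions chosen for illustration; they do not describe any particular chip.

```python
def transform(matrix, vertex):
    """Multiply a 4x4 row-major matrix by a homogeneous (x, y, z, 1) vertex."""
    x, y, z = vertex
    v = (x, y, z, 1.0)
    return tuple(sum(matrix[r][c] * v[c] for c in range(4)) for r in range(4))


def project(clip):
    """Perspective divide: clip-space (x, y, z, w) to normalized device coordinates."""
    x, y, z, w = clip
    return (x / w, y / w, z / w)


def lambert(normal, light_dir):
    """Diffuse intensity: clamped dot product of a unit normal and a unit light direction."""
    dot = sum(n * l for n, l in zip(normal, light_dir))
    return max(0.0, dot)


# Hypothetical combined matrix: move the model 5 units away from the camera
# and set w to the distance from the camera, so that farther points shrink.
MODEL_VIEW_PROJ = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, -5.0],
    [0.0, 0.0, -1.0, 5.0],
]

clip = transform(MODEL_VIEW_PROJ, (1.0, 2.0, 0.0))
ndc = project(clip)                                    # "transform" half
intensity = lambert((0.0, 0.0, 1.0), (0.0, 0.0, 1.0))  # "lighting" half
```

Hardware T&L performs this kind of work for every vertex of a scene, which is why moving it off the CPU mattered.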
2000 to 2010
Nvidia was first to produce a chip capable of programmable
shading
: the ''
GeForce 3
'' (code named NV20). Each pixel could now be processed by a short "program" that could include additional image textures as inputs, and each geometric vertex could likewise be processed by a short program before it was projected onto the screen. Used in the
Xbox
console, it competed with the
PlayStation 2, which used a custom vector unit for hardware accelerated vertex processing (commonly referred to as VU0/VU1). The earliest incarnations of shader execution engines used in
Xbox
were not general purpose and could not execute arbitrary pixel code. Vertices and pixels were processed by different units, each with its own resources, and pixel shaders had much tighter constraints (because they execute at much higher frequencies than vertex shaders). Pixel shading engines were actually more akin to a highly customizable function block and did not really "run" a program. Many of these disparities between vertex and pixel shading were not addressed until much later with the
Unified Shader Model
.
By October 2002, with the introduction of the
ATI ''
Radeon 9700
'' (also known as R300), the world's first
Direct3D
9.0 accelerator, pixel and vertex shaders could implement
looping and lengthy
floating point
math, and were quickly becoming as flexible as CPUs, yet orders of magnitude faster for image-array operations. Pixel shading is often used for
bump mapping
, which adds texture to make an object look shiny, dull, rough, or even round or extruded.
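The idea behind bump mapping can be sketched in a few lines of Python: the normal of a flat surface is tilted according to the local slope of a height map, and the tilted normal is then used for lighting. This is a minimal sketch under simplifying assumptions (a flat base surface facing +z, central differences, a single diffuse light); the names `bumped_normal` and `shade` are hypothetical.

```python
import math


def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def bumped_normal(height, x, y, strength=1.0):
    """Perturb the flat normal (0, 0, 1) using central differences of the height map."""
    rows, cols = len(height), len(height[0])
    # Clamp neighbour lookups at the edges of the map.
    left = height[y][max(x - 1, 0)]
    right = height[y][min(x + 1, cols - 1)]
    up = height[max(y - 1, 0)][x]
    down = height[min(y + 1, rows - 1)][x]
    dx = (right - left) * 0.5 * strength
    dy = (down - up) * 0.5 * strength
    # The normal leans away from the direction in which the height increases.
    return normalize((-dx, -dy, 1.0))


def shade(normal, light_dir):
    """Diffuse intensity for a unit normal and an arbitrary light direction."""
    light = normalize(light_dir)
    return max(0.0, sum(n * l for n, l in zip(normal, light)))
```

The geometry itself never changes; only the normal used in the lighting calculation does, which is why the effect is cheap enough to run per pixel.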
With the introduction of the Nvidia
GeForce 8 series and its new generic stream processing units, GPUs became more generalized computing devices. Today,
parallel
GPUs have begun making computational inroads against the CPU, and a subfield of research, dubbed GPU Computing or
GPGPU
for ''General Purpose Computing on GPU'', has found its way into fields as diverse as
machine learning
,
oil exploration
, scientific
image processing
,
linear algebra
,
statistics,
3D reconstruction
and even
stock options
pricing determination.
GPGPU
at the time was the precursor to what is now called a compute shader (e.g. CUDA, OpenCL, DirectCompute) and actually abused the hardware to a degree by treating the data passed to algorithms as texture maps and executing algorithms by drawing a triangle or quad with an appropriate pixel shader. This entails some overhead, since units like the
Scan Converter are involved where they are not really needed (nor are triangle manipulations even a concern, except to invoke the pixel shader).
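The texture-based trick described above can be mimicked on the CPU to make the data flow concrete. In this Python sketch, which is only an analogy and does not use any real graphics API, input arrays are packed into 2D "textures", and a "pixel shader" function is invoked once per output pixel of a full-screen quad. All names (`draw_fullscreen_quad`, `add_shader`, `to_texture`, `gpgpu_add`) are hypothetical.

```python
def to_texture(values, width):
    """Pack a flat list into rows of `width` texels (length must be a multiple of width)."""
    return [values[i:i + width] for i in range(0, len(values), width)]


def draw_fullscreen_quad(width, height, pixel_shader, textures):
    """Stand-in for rasterizing a quad: run the shader once per output pixel."""
    return [[pixel_shader(x, y, textures) for x in range(width)]
            for y in range(height)]


def add_shader(x, y, textures):
    """A 'pixel shader' that samples two input textures and adds the texels."""
    a, b = textures
    return a[y][x] + b[y][x]


def gpgpu_add(a, b, width=2):
    """Element-wise addition expressed as a rendering pass."""
    tex_a, tex_b = to_texture(a, width), to_texture(b, width)
    target = draw_fullscreen_quad(width, len(tex_a), add_shader, (tex_a, tex_b))
    # Read the render target back into a flat result array.
    return [texel for row in target for texel in row]


result = gpgpu_add([1, 2, 3, 4], [10, 20, 30, 40])
```

The packing, "drawing" and read-back steps are exactly the overhead that later compute-oriented programming models removed, since a compute kernel can index its input arrays directly.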
Nvidia's
CUDA
platform, first introduced in 2007, was the earliest widely adopted programming model for GPU computing. More recently
OpenCL
has become broadly supported. OpenCL is an open standard defined by the Khronos Group which allows for the development of code for both GPUs and CPUs with an emphasis on portability. OpenCL solutions are supported by Intel, AMD, Nvidia, and ARM, and according to a report by Evans Data, OpenCL is the GPGPU development platform most widely used by developers in both the US and Asia Pacific.
2010 to present
In 2010, Nvidia began a partnership with
Audi
to power their cars' dashboards, using the
Tegra
GPUs to provide increased functionality to cars' navigation and entertainment systems. Advances in GPU technology in cars have helped advance
self-driving technology. AMD's
Radeon HD 6000 Series
cards were released in 2010, and in 2011 AMD released its 6000M Series discrete GPUs for use in mobile devices. The Kepler line of graphics cards by Nvidia came out in 2012 and was used in Nvidia's 600 and 700 series cards. A feature of this new GPU microarchitecture was GPU Boost, a technology that raises or lowers the clock speed of a video card according to its power draw. The
Kepler microarchitecture was manufactured on the 28 nm process.
The
PS4 and
Xbox One
were released in 2013; both use GPUs based on
AMD's Radeon HD 7850 and 7790. Nvidia's Kepler line of GPUs was followed by the
Maxwell
line, manufactured on the same process. Nvidia's 28 nm chips were manufactured by TSMC, the Taiwan Semiconductor Manufacturing Company. Compared to the earlier 40 nm technology, this manufacturing process allowed a 20 percent boost in performance while drawing less power.
Virtual reality
headsets have very high system requirements. VR headset manufacturers recommended the GTX 970 and the R9 290X or better at the time of their release.
Pascal, released in 2016, is the next generation of consumer graphics cards by Nvidia. The
GeForce 10 series
of cards belongs to this generation. They are made using the 16 nm manufacturing process, which improves upon previous microarchitectures. Nvidia has released one non-consumer card under the new
Volta
architecture, the Titan V. Changes from the Titan XP, Pascal's high-end card, include an increase in the number of CUDA cores, the addition of tensor cores, and
HBM2
. Tensor cores are cores specially designed for deep learning, while high-bandwidth memory is on-package, stacked, lower-clocked memory that offers an extremely wide memory bus, which is useful for the Titan V's intended purpose. To emphasize that the Titan V is not a gaming card, Nvidia removed the "GeForce GTX" prefix it adds to consumer gaming cards.
On August 20, 2018, Nvidia launched the RTX 20 series GPUs that add ray-tracing cores to GPUs, improving their performance on lighting effects.
Polaris 11
and
Polaris 10
GPUs from AMD are fabricated on a 14-nanometer process. Their release resulted in a substantial increase in the performance per watt of AMD video cards. AMD has also released the Vega GPU series for the high-end market as a competitor to Nvidia's high-end Pascal cards, also featuring HBM2 like the Titan V.
In 2019, AMD released the successor to their
Graphics Core Next
(GCN) microarchitecture/instruction set. Dubbed RDNA, it first appeared in the
Radeon RX 5000 series
of video cards, which launched on July 7, 2019.
[AMD press release: AMD.com. Retrieved October 5th, 2019] Later, the company announced that the successor to the RDNA microarchitecture would be a refresh. Dubbed RDNA 2, the new microarchitecture was reportedly scheduled for release in Q4 2020.
AMD unveiled the
Radeon RX 6000 series
, its next-gen RDNA 2 graphics cards with support for hardware-accelerated ray tracing, at an online event on October 28, 2020. The lineup initially consisted of the RX 6800, RX 6800 XT and RX 6900 XT. The RX 6800 and 6800 XT launched on November 18, 2020, with the RX 6900 XT being released on December 8, 2020.
The RX 6700 XT, which is based on Navi 22, was launched on March 18, 2021.
The
PlayStation 5
and
Xbox Series X and Series S
were released in 2020; both use GPUs based on the
RDNA 2
microarchitecture with proprietary tweaks and different GPU configurations in each system's implementation.
GPU companies
Many companies have produced GPUs under a number of brand names. In 2009,
Intel
,
Nvidia
and
AMD/
ATI were the market share leaders, with 49.4%, 27.8% and 20.6% market share respectively. However, those numbers include Intel's integrated graphics solutions as GPUs. Not counting those,
Nvidia
and
AMD control nearly 100% of the market as of 2018. Their respective market shares are 66% and 33%. In addition,
Matrox
produces GPUs.
Modern smartphones mostly use
Adreno
GPUs from
Qualcomm,
PowerVR
GPUs from
Imagination Technologies
and
Mali GPUs from
ARM.
Computational functions
Modern GPUs use most of their
transistor
s to do calculations related to
3D computer graphics
. In addition to the 3D hardware, today's GPUs include basic 2D acceleration and
framebuffer
capabilities (usually with a VGA compatibility mode). Newer cards such as AMD/ATI HD5000-HD7000 even lack dedicated 2D acceleration; it has to be emulated by 3D hardware. GPUs were initially used to accelerate the memory-intensive work of
texture mapping
and
rendering polygons, later adding units to accelerate
geometric
calculations such as the
rotation
and
translation
of vertices into different coordinate systems. Recent developments in GPUs include support for programmable shaders which can manipulate vertices and textures with many of the same operations supported by CPUs, oversampling and interpolation techniques to reduce aliasing, and very high-precision color spaces. Given that most of these computations involve matrix and vector operations, engineers and scientists have increasingly studied the use of GPUs for non-graphical calculations; they are especially suited to other embarrassingly parallel problems.
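What makes a problem "embarrassingly parallel" is that each output element depends only on its own inputs, so the work can be split into chunks that never communicate. The Python sketch below shows this decomposition for the vector operation a*x + y; the chunking scheme and the names `saxpy_chunk` and `parallel_saxpy` are illustrative assumptions, and a thread pool stands in for the many execution units of a GPU (CPython threads will not actually speed up this arithmetic).

```python
from concurrent.futures import ThreadPoolExecutor


def saxpy_chunk(args):
    """Compute a*x + y for one independent chunk of the input vectors."""
    a, xs, ys = args
    return [a * x + y for x, y in zip(xs, ys)]


def parallel_saxpy(a, xs, ys, workers=4):
    """Split the vectors into chunks and process each chunk independently."""
    size = max(1, -(-len(xs) // workers))  # ceiling division
    chunks = [(a, xs[i:i + size], ys[i:i + size])
              for i in range(0, len(xs), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(saxpy_chunk, chunks)  # map preserves chunk order
        return [value for part in parts for value in part]
```

Because no chunk reads another chunk's results, the answer is the same for any number of workers, which is the property that lets a GPU spread such work across thousands of threads.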
Several factors of the GPU's construction enter into the performance of the card for real-time rendering. Common factors include the size of the connector pathways in the semiconductor device fabrication, the clock signal frequency, and the number and size of various on-chip memory caches. Another factor is the number of Streaming Multiprocessors (SM) for Nvidia GPUs, or Compute Units (CU) for AMD GPUs: the core on-silicon processor units within the GPU chip that perform the core calculations, typically working in parallel with other SMs/CUs on the GPU. GPU performance is typically measured in floating point operations per second (FLOPS), with GPUs in the 2010s and 2020s typically delivering performance measured in teraflops (TFLOPS). This is an estimated performance measure, as other factors can affect the actual display rate.
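A theoretical peak FLOPS figure is commonly estimated by multiplying the number of shader cores by the clock frequency and by the number of floating point operations each core can issue per cycle. The Python sketch below assumes the usual convention of two operations per cycle (one fused multiply-add); the function name and the example figures are hypothetical and do not describe a specific product.

```python
def peak_tflops(shader_cores, clock_ghz, flops_per_core_per_cycle=2):
    """Theoretical peak throughput in TFLOPS.

    cores * GHz gives billions of cycles of work per second; the default of
    2 FLOPs per cycle assumes one fused multiply-add per core per clock.
    """
    gflops = shader_cores * clock_ghz * flops_per_core_per_cycle
    return gflops / 1000.0


# Hypothetical GPU: 2048 shader cores running at 1.5 GHz.
estimate = peak_tflops(2048, 1.5)
```

Real workloads rarely reach this figure, since memory bandwidth, cache behaviour and occupancy also limit throughput.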
With the emergence of deep learning, the importance of GPUs has increased. In research done by Indigo, it was found that while training deep learning neural networks, GPUs can be 250 times faster than CPUs. There has been some level of competition in this area with ASICs, most prominently the Tensor Processing Unit (TPU) made by Google. However, ASICs require changes to existing code and GPUs are still very popular.
GPU accelerated video decoding and encoding
Most GPUs made since 1995 support the YUV color space and hardware overlays, important for digital video playback, and many GPUs made since 2000 also support MPEG primitives such as motion compensation and the inverse discrete cosine transform (iDCT). This process of hardware accelerated video decoding, where portions of the video decoding process and video post-processing are offloaded to the GPU hardware, is commonly referred to as "GPU accelerated video decoding", "GPU assisted video decoding", "GPU hardware accelerated video decoding" or "GPU hardware assisted video decoding".
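Hardware YUV support matters because decoded video is stored as luma and chroma samples that must be converted to RGB for display. The Python sketch below shows one such per-pixel conversion, assuming 8-bit full-range YCbCr with the BT.601 coefficients used by JFIF; video streams often use a limited range and other coefficient sets, so this is an illustration rather than the conversion any given GPU applies.

```python
def clamp8(value):
    """Round to the nearest integer and clamp to the 8-bit range 0..255."""
    return max(0, min(255, int(round(value))))


def ycbcr_to_rgb(y, cb, cr):
    """Convert one full-range BT.601 YCbCr pixel to 8-bit RGB."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp8(r), clamp8(g), clamp8(b)
```

Pixels with neutral chroma (Cb = Cr = 128) map to grey levels equal to the luma value, which is a quick sanity check for any implementation.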
More recent graphics cards even decode high-definition video on the card, offloading the central processing unit. The most common APIs for GPU accelerated video decoding are DxVA (DirectX Video Acceleration) for the Microsoft Windows operating system and VDPAU, VAAPI, XvMC (X-Video Motion Compensation), and XvBA (X-Video Bitstream Acceleration) for Linux-based and UNIX-like operating systems. All except XvMC are capable of decoding videos encoded with MPEG-1, MPEG-2, MPEG-4 ASP (MPEG-4 Part 2), MPEG-4 AVC (H.264 / DivX 6), VC-1, WMV3/WMV9, Xvid / OpenDivX (DivX 4), and DivX 5 codecs, while XvMC is only capable of decoding MPEG-1 and MPEG-2.
There are several :Video compression and decompression ASIC, dedicated hardware video decoding and encoding solutions.
Video decoding processes that can be accelerated
The video decoding processes that can be accelerated by modern GPU hardware are:
* Motion compensation (mocomp)
* Inverse discrete cosine transform (iDCT)
** Inverse telecine 3:2 and 2:2 pull-down correction
* Inverse modified discrete cosine transform (iMDCT)
* In-loop deblocking filter
* Intra-frame prediction
* Inverse quantization (IQ)
* Variable-length decoding (VLD), more commonly known as slice-level acceleration
* Spatial-temporal deinterlacing and automatic interlace/progressive source detection
* Bitstream processing (context-adaptive variable-length coding/context-adaptive binary arithmetic coding) and perfect pixel positioning.
The above operations also have applications in video editing, encoding and transcoding.
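As an illustration of one item in the list above, the following Python sketch implements a one-dimensional inverse discrete cosine transform together with the matching forward transform. It assumes the orthonormal DCT-II/DCT-III pair; real codecs typically apply a two-dimensional transform to small blocks and use fixed-point approximations, so this is a reference for the mathematics rather than a description of GPU hardware. All function names are hypothetical.

```python
import math


def _scale(k, n):
    # Orthonormal scaling: the DC term (k == 0) uses a smaller factor.
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct_1d(samples):
    """Forward DCT-II of a list of samples (orthonormal scaling)."""
    n = len(samples)
    return [
        _scale(k, n)
        * sum(samples[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
              for i in range(n))
        for k in range(n)
    ]


def idct_1d(coeffs):
    """Inverse transform (DCT-III) that undoes dct_1d."""
    n = len(coeffs)
    return [
        sum(_scale(k, n) * coeffs[k]
            * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for k in range(n))
        for i in range(n)
    ]


row = [52, 55, 61, 66, 70, 61, 64, 73]
restored = idct_1d(dct_1d(row))
# Up to floating-point rounding, the round trip reproduces the input.
print([round(v) for v in restored])
```

Each output sample is an independent weighted sum of the coefficients, which is why the transform maps naturally onto many parallel arithmetic units.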
GPU forms
Terminology
In personal computers, there are two main forms of GPUs. Each has many synonyms:
* ''Dedicated graphics card'' - also called ''discrete''.
* ''Integrated graphics'' - also called ''shared graphics solutions'', ''integrated graphics processors'' (IGP), or ''unified memory architecture'' (UMA).
Usage specific GPU
Most GPUs are designed for a specific use, such as real-time 3D graphics or other mass calculations:
# Gaming
#* GeForce GTX, RTX
#* Nvidia Titan
#* Radeon HD, R5, R7, R9, RX, Vega and Navi series
#* Radeon VII
# Cloud Gaming
#*Nvidia GRID
#* Radeon Sky
# Workstation
#* Nvidia Quadro
#* Nvidia RTX
#* AMD FirePro
#* AMD Radeon Pro
#*Intel Arc Pro
# Cloud Workstation
#*Nvidia Tesla
#* AMD FireStream
# Artificial Intelligence training and Cloud
#*Nvidia Tesla
#* AMD Radeon Instinct
# Automated/Driverless car
#* Nvidia Drive PX
Dedicated graphics cards
The GPUs of the most powerful class typically interface with the motherboard by means of an expansion slot such as PCI Express (PCIe) or Accelerated Graphics Port (AGP) and can usually be replaced or upgraded with relative ease, assuming the motherboard is capable of supporting the upgrade. A few graphics cards still use Peripheral Component Interconnect (PCI) slots, but their bandwidth is so limited that they are generally used only when a PCIe or AGP slot is not available.
A dedicated GPU is not necessarily removable, nor does it necessarily interface with the motherboard in a standard fashion. The term "dedicated" refers to the fact that dedicated graphics cards have RAM that is dedicated to the card's use, not to the fact that ''most'' dedicated GPUs are removable. Further, this RAM is usually specially selected for the expected serial workload of the graphics card (see GDDR). Sometimes, systems with dedicated, ''discrete'' GPUs were called "DIS" systems, as opposed to "UMA" systems (see next section). Dedicated GPUs for portable computers are most commonly interfaced through a non-standard and often proprietary slot due to size and weight constraints. Such ports may still be considered PCIe or AGP in terms of their logical host interface, even if they are not physically interchangeable with their counterparts.
Technologies such as SLI and NVLink by Nvidia and CrossFire by AMD allow multiple GPUs to draw images simultaneously for a single screen, increasing the processing power available for graphics. These technologies, however, are increasingly uncommon, as most games do not fully utilize multiple GPUs and most users cannot afford them. Multiple GPUs are still used on supercomputers (such as Summit), on workstations to accelerate video (processing multiple videos at once) and 3D rendering, for VFX and for simulations, and in AI to expedite training, as is the case with Nvidia's lineup of DGX workstations and servers and Tesla GPUs, and Intel's upcoming Ponte Vecchio GPUs.
Integrated graphics processing unit

''Integrated graphics processing unit'' (IGPU), ''integrated graphics'', ''shared graphics solutions'', ''integrated graphics processors'' (IGP) or ''unified memory architecture'' (UMA) utilize a portion of a computer's system RAM rather than dedicated graphics memory. IGPs can be integrated onto the motherboard as part of the (northbridge) chipset, or on the same die (integrated circuit) with the CPU (like AMD APU or Intel HD Graphics). On certain motherboards, AMD's IGPs can use dedicated sideport memory, a separate fixed block of high-performance memory reserved for use by the GPU. In early 2007, computers with integrated graphics accounted for about 90% of all PC shipments. They are less costly to implement than dedicated graphics processing, but tend to be less capable. Historically, integrated processing was considered unfit to play 3D games or run graphically intensive programs but could run less intensive programs such as Adobe Flash. Examples of such IGPs would be offerings from SiS and VIA circa 2004. However, modern integrated graphics processors such as AMD Accelerated Processing Unit and Intel Graphics Technology (HD, UHD, Iris, Iris Pro, Iris Plus, and Xe-LP) are more than capable of handling 2D graphics or low-stress 3D graphics.
Since GPU computations are extremely memory-intensive, integrated processing may find itself competing with the CPU for the relatively slow system RAM, as it has minimal or no dedicated video memory. IGPs can have up to 29.856 GB/s of memory bandwidth from system RAM, whereas a graphics card may have up to 264 GB/s of bandwidth between its RAM and GPU core. This memory bus bandwidth can limit the performance of the GPU, though multi-channel memory can mitigate this deficiency.
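Peak memory bandwidth follows from the bus width and the transfer rate. The Python sketch below shows the arithmetic; the two memory configurations are assumptions chosen because they happen to reproduce the figures quoted above (the source does not say which configurations it had in mind), and the function name is hypothetical.

```python
def bandwidth_gb_per_s(bus_width_bits, transfers_per_second):
    """Peak bandwidth in GB/s, taking 1 GB as 10**9 bytes."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfers_per_second / 1e9


# Assumed system RAM: dual-channel (2 x 64-bit) memory at 1866 MT/s.
system_ram = bandwidth_gb_per_s(128, 1866e6)

# Assumed graphics memory: 384-bit bus at 5.5 GT/s.
graphics_ram = bandwidth_gb_per_s(384, 5.5e9)

print(round(system_ram, 3))    # 29.856
print(round(graphics_ram, 3))  # 264.0
```

The same formula shows why adding memory channels helps an IGP: doubling the effective bus width doubles the peak bandwidth at an unchanged transfer rate.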
Older integrated graphics chipsets lacked hardware transform and lighting, but newer ones include it.
Hybrid graphics processing
This newer class of GPUs competes with integrated graphics in the low-end desktop and notebook markets. The most common implementations of this are ATI's HyperMemory and Nvidia's TurboCache.
Hybrid graphics cards are somewhat more expensive than integrated graphics, but much less expensive than dedicated graphics cards. These share memory with the system and have a small dedicated memory cache, to make up for the high latency of the system RAM. Technologies within PCI Express can make this possible. While these solutions are sometimes advertised as having as much as 768 MB of RAM, this refers to how much can be shared with the system memory.
Stream processing and general purpose GPUs (GPGPU)
It is becoming increasingly common to use a general purpose graphics processing unit (GPGPU) as a modified form of stream processor (or a vector processor), running compute kernels. This turns the massive computational power of a modern graphics accelerator's shader pipeline into general-purpose computing power, as opposed to being hardwired solely to do graphical operations. In certain applications requiring massive vector operations, this can yield several orders of magnitude higher performance than a conventional CPU. The two largest discrete (see "Dedicated graphics cards" above) GPU designers, AMD and Nvidia, are beginning to pursue this approach with an array of applications. Both Nvidia and AMD have teamed with Stanford University to create a GPU-based client for the Folding@home distributed computing project, for protein folding calculations. In certain circumstances, the GPU calculates forty times faster than the CPUs traditionally used by such applications.
GPGPUs can be used for many types of embarrassingly parallel tasks, including ray tracing. They are generally suited to high-throughput computations that exhibit data parallelism, which lets them exploit the wide vector width SIMD architecture of the GPU.
Furthermore, GPU-based high performance computers are starting to play a significant role in large-scale modelling. Three of the 10 most powerful supercomputers in the world take advantage of GPU acceleration.
GPUs support API extensions to the C programming language such as OpenCL and OpenMP. Furthermore, each GPU vendor introduced its own API which only works with their cards: AMD APP SDK and CUDA, from AMD and Nvidia, respectively. These technologies allow specified functions called compute kernels from a normal C program to run on the GPU's stream processors. This makes it possible for C programs to take advantage of a GPU's ability to operate on large buffers in parallel, while still using the CPU when appropriate. CUDA is also the first API to allow CPU-based applications to directly access the resources of a GPU for more general purpose computing without the limitations of using a graphics API.
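The compute-kernel model can be sketched without any GPU API. In the Python illustration below, a kernel is a function that handles exactly one element of a buffer, identified by its index, and a launch runs that function once per index. The names `saxpy_kernel` and `launch` are hypothetical, and the loop runs sequentially on the CPU; on real hardware the invocations would be distributed across the stream processors. It is not CUDA or OpenCL code.

```python
def saxpy_kernel(i, a, x, y, out):
    """Kernel body: one logical thread computes element i only."""
    out[i] = a * x[i] + y[i]


def launch(kernel, n, *args):
    """Stand-in for a GPU launch: invoke the kernel once per index.

    This runs sequentially; a GPU would execute many indices in parallel.
    """
    for i in range(n):
        kernel(i, *args)


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(x)

launch(saxpy_kernel, len(x), 2.0, x, y, out)
print(out)  # [12.0, 24.0, 36.0, 48.0]
```

Because no invocation reads a value written by another, the order of execution does not matter, which is what allows the work to be spread over many processors.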
Since 2005 there has been interest in using the performance offered by GPUs for evolutionary computation in general, and for accelerating the fitness evaluation in genetic programming in particular. Most approaches compile linear or tree programs on the host PC and transfer the executable to the GPU to be run. Typically the performance advantage is only obtained by running the single active program simultaneously on many example problems in parallel, using the GPU's SIMD architecture. However, substantial acceleration can also be obtained by not compiling the programs, and instead transferring them to the GPU, to be interpreted there. Acceleration can then be obtained by either interpreting multiple programs simultaneously, simultaneously running multiple example problems, or combinations of both. A modern GPU can readily simultaneously interpret hundreds of thousands of very small programs.
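The idea of interpreting one evolved program on many example problems at once can be sketched as follows. This Python illustration assumes a hypothetical postfix (reverse Polish) program representation with a single input variable "x"; every stack entry holds one value per example problem, so each instruction is applied to all problems in lockstep, in the spirit of SIMD execution. It runs on the CPU and is not taken from any particular genetic programming system.

```python
import operator

# Supported binary operations (an assumed, minimal instruction set).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def interpret(program, cases):
    """Evaluate one postfix program on every example problem at once.

    program: list of tokens, e.g. ["x", "x", "*", 1.0, "+"] for x*x + 1
    cases: input values for the variable "x", one per example problem
    """
    stack = []
    for token in program:
        if token == "x":
            stack.append(list(cases))
        elif token in OPS:
            right = stack.pop()
            left = stack.pop()
            # One instruction, applied to all example problems together.
            stack.append([OPS[token](l, r) for l, r in zip(left, right)])
        else:
            # Numeric constant, broadcast to every example problem.
            stack.append([float(token)] * len(cases))
    return stack.pop()


def fitness(program, cases, targets):
    """Sum of absolute errors over all example problems (lower is better)."""
    outputs = interpret(program, cases)
    return sum(abs(o - t) for o, t in zip(outputs, targets))


program = ["x", "x", "*", 1.0, "+"]
print(interpret(program, [0.0, 1.0, 2.0]))  # [1.0, 2.0, 5.0]
```

Interpreting many different programs simultaneously would add a second axis of parallelism on top of the per-problem axis shown here.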
Some modern workstation GPUs, such as the Nvidia Quadro workstation cards using the Volta and Turing architectures, feature dedicated processing cores for tensor-based deep learning applications. In Nvidia's current series of GPUs these cores are called Tensor Cores. These GPUs usually have significant FLOPS performance increases, utilizing 4×4 matrix multiplication and accumulation, resulting in hardware performance up to 128 TFLOPS in some applications. These tensor cores are also supposed to appear in consumer cards running the Turing architecture, and possibly in the Navi series of consumer cards from AMD.
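A 4×4 matrix multiplication with accumulation, D = A × B + C, is the kind of small fixed-size operation that such dedicated cores are commonly described as performing. The plain-Python sketch below shows only the arithmetic; the function name is hypothetical, and real hardware works on reduced-precision inputs and produces all sixteen results in parallel rather than with nested loops.

```python
def mma_4x4(a, b, c):
    """Return D = A x B + C for 4x4 matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(4)) + c[i][j] for j in range(4)]
        for i in range(4)
    ]


identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
b = [[i * 4 + j for j in range(4)] for i in range(4)]
ones = [[1] * 4 for _ in range(4)]

# With A as the identity, D is simply B with 1 added to every entry.
for row in mma_4x4(identity, b, ones):
    print(row)
```

Each of the sixteen outputs needs four multiplications and four additions, so one such operation accounts for 128 floating-point operations.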
External GPU (eGPU)
An external GPU is a graphics processor located outside of the housing of the computer, similar to a large external hard drive. External graphics processors are sometimes used with laptop computers. Laptops might have a substantial amount of RAM and a sufficiently powerful central processing unit (CPU), but often lack a powerful graphics processor, and instead have a less powerful but more energy-efficient on-board graphics chip. On-board graphics chips are often not powerful enough for playing video games, or for other graphically intensive tasks, such as editing video or 3D animation/rendering.
Therefore, it is desirable to be able to attach a GPU to some external bus of a notebook. PCI Express is the only bus used for this purpose. The port may be, for example, an ExpressCard or mPCIe port (PCIe ×1, up to 5 or 2.5 Gbit/s respectively) or a Thunderbolt 1, 2, or 3 port (PCIe ×4, up to 10, 20, or 40 Gbit/s respectively). Those ports are only available on certain notebook systems. eGPU enclosures include their own power supply (PSU), because powerful GPUs can easily consume hundreds of watts.
Official vendor support for external GPUs has gained traction recently. One notable milestone was Apple's decision to officially support external GPUs with macOS High Sierra 10.13.4. There are also several major hardware vendors (HP, Alienware, Razer) releasing Thunderbolt 3 eGPU enclosures. This support has continued to fuel eGPU implementations by enthusiasts.
Sales
In 2013, 438.3 million GPUs were shipped globally and the forecast for 2014 was 414.2 million.
See also
* Texture mapping unit (TMU)
* Render output unit (ROP)
* Brute force attack
* Computer hardware
* Computer monitor
* GPU cache
* GPU virtualization
* Manycore processor
* Physics processing unit (PPU)
* Tensor processing unit (TPU)
* Ray-tracing hardware
* Software rendering
* Vision processing unit (VPU)
* Vector processor
* Video card
* Video display controller
* Video game console
* AI accelerator
* GPU vector processor internal features
Hardware
* List of AMD graphics processing units
* List of Nvidia graphics processing units
* List of Intel graphics processing units
* Intel GMA
* Larrabee
* Nvidia PureVideo - the bit-stream technology from Nvidia used in their graphics chips to accelerate video decoding on hardware GPU with DXVA.
* System on a chip (SoC)
* Unified Video Decoder (UVD) – the video decoding bit-stream technology from ATI to support hardware (GPU) decode with DXVA
APIs
* OpenGL API
* DirectX Video Acceleration (DxVA) API for the Microsoft Windows operating system
* Mantle (API)
* Vulkan (API)
* Video Acceleration API (VA API)
* VDPAU (Video Decode and Presentation API for Unix)
* X-Video Bitstream Acceleration (XvBA), the X11 equivalent of DXVA for MPEG-2, H.264, and VC-1
* X-Video Motion Compensation – the X11 equivalent for MPEG-2 video codec only
Applications
* GPU cluster
* Mathematica – includes built-in support for CUDA and OpenCL GPU execution
* Molecular modeling on GPU
* Deeplearning4j – open-source, distributed deep learning for Java