A neural processing unit (NPU), also known as AI accelerator or deep learning processor, is a class of specialized
hardware accelerator or computer system designed to accelerate
artificial intelligence
Artificial intelligence (AI) is the capability of computer, computational systems to perform tasks typically associated with human intelligence, such as learning, reasoning, problem-solving, perception, and decision-making. It is a field of re ...
(AI) and
machine learning
Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of Computational statistics, statistical algorithms that can learn from data and generalise to unseen data, and thus perform Task ( ...
applications, including
artificial neural network
In machine learning, a neural network (also artificial neural network or neural net, abbreviated ANN or NN) is a computational model inspired by the structure and functions of biological neural networks.
A neural network consists of connected ...
s and
computer vision
Computer vision tasks include methods for image sensor, acquiring, Image processing, processing, Image analysis, analyzing, and understanding digital images, and extraction of high-dimensional data from the real world in order to produce numerical ...
.
Use
Their purpose is either to efficiently execute already trained AI models (inference) or to train AI models. Their applications include
algorithm
In mathematics and computer science, an algorithm () is a finite sequence of Rigour#Mathematics, mathematically rigorous instructions, typically used to solve a class of specific Computational problem, problems or to perform a computation. Algo ...
s for
robotics
Robotics is the interdisciplinary study and practice of the design, construction, operation, and use of robots.
Within mechanical engineering, robotics is the design and construction of the physical structures of robots, while in computer s ...
,
Internet of things
Internet of things (IoT) describes devices with sensors, processing ability, software and other technologies that connect and exchange data with other devices and systems over the Internet or other communication networks. The IoT encompasse ...
, and
data
Data ( , ) are a collection of discrete or continuous values that convey information, describing the quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpreted for ...
-intensive or sensor-driven tasks. They are often
manycore designs and focus on
low-precision arithmetic, novel
dataflow architecture
Dataflow architecture is a dataflow-based computer architecture that directly contrasts the traditional von Neumann architecture or control flow architecture. Dataflow architectures have no program counter, in concept: the executability and ex ...
s, or
in-memory computing
The term is used for two different things:
# In computer science, in-memory processing, also called compute-in-memory (CIM), or processing-in-memory (PIM), is a computer architecture in which data operations are available directly on the data ...
capability. , a typical AI
integrated circuit
An integrated circuit (IC), also known as a microchip or simply chip, is a set of electronic circuits, consisting of various electronic components (such as transistors, resistors, and capacitors) and their interconnections. These components a ...
chip
contains tens of billions of
MOSFET
upright=1.3, Two power MOSFETs in amperes">A in the ''on'' state, dissipating up to about 100 watt">W and controlling a load of over 2000 W. A matchstick is pictured for scale.
In electronics, the metal–oxide–semiconductor field- ...
s.
AI accelerators are used in mobile devices such as Apple
iPhone
The iPhone is a line of smartphones developed and marketed by Apple that run iOS, the company's own mobile operating system. The first-generation iPhone was announced by then–Apple CEO and co-founder Steve Jobs on January 9, 2007, at ...
s and
Huawei
Huawei Technologies Co., Ltd. ("Huawei" sometimes stylized as "HUAWEI"; ; zh, c=华为, p= ) is a Chinese multinational corporationtechnology company in Longgang, Shenzhen, Longgang, Shenzhen, Guangdong. Its main product lines include teleco ...
cellphones, and personal computers such as
Intel
Intel Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California, and Delaware General Corporation Law, incorporated in Delaware. Intel designs, manufactures, and sells computer compo ...
laptops,
AMD
Advanced Micro Devices, Inc. (AMD) is an American multinational corporation and technology company headquartered in Santa Clara, California and maintains significant operations in Austin, Texas. AMD is a hardware and fabless company that de ...
laptops and
Apple silicon
Apple silicon is a series of system on a chip (SoC) and system in a package (SiP) processors designed by Apple Inc., mainly using the ARM architecture family, ARM architecture. They are used in nearly all of the company's devices including Mac ...
Macs. Accelerators are used in
cloud computing
Cloud computing is "a paradigm for enabling network access to a scalable and elastic pool of shareable physical or virtual resources with self-service provisioning and administration on-demand," according to International Organization for ...
servers, including
tensor processing units (TPU) in
Google Cloud Platform
Google Cloud Platform (GCP) is a suite of cloud computing services offered by Google that provides a series of modular cloud services including computing, Computer data storage, data storage, Data analysis, data analytics, and machine learnin ...
and
Trainium and
Inferentia chips in
Amazon Web Services
Amazon Web Services, Inc. (AWS) is a subsidiary of Amazon.com, Amazon that provides Software as a service, on-demand cloud computing computing platform, platforms and Application programming interface, APIs to individuals, companies, and gover ...
. Many vendor-specific terms exist for devices in this category, and it is an
emerging technology
Emerging technologies are technologies whose development, practical applications, or both are still largely unrealized. These technologies are generally new but also include old technologies finding new applications. Emerging technologies are o ...
without a
dominant design
Dominant design is a technology management concept introduced by James M. Utterback and William J. Abernathy in 1975, identifying key technological features that become a de facto standard. A dominant design is the one that wins the allegiance of ...
.
Graphics processing units
A graphics processing unit (GPU) is a specialized electronic circuit designed for digital image processing and to accelerate computer graphics, being present either as a discrete video card or embedded on motherboards, mobile phones, personal co ...
designed by companies such as
Nvidia
Nvidia Corporation ( ) is an American multinational corporation and technology company headquartered in Santa Clara, California, and incorporated in Delaware. Founded in 1993 by Jensen Huang (president and CEO), Chris Malachowsky, and Curti ...
and
AMD
Advanced Micro Devices, Inc. (AMD) is an American multinational corporation and technology company headquartered in Santa Clara, California and maintains significant operations in Austin, Texas. AMD is a hardware and fabless company that de ...
often include AI-specific hardware, and are commonly used as AI accelerators, both for
training
Training is teaching, or developing in oneself or others, any skills and knowledge or fitness that relate to specific useful competencies. Training has specific goals of improving one's capability, capacity, productivity and performance. I ...
and
inference
Inferences are steps in logical reasoning, moving from premises to logical consequences; etymologically, the word '' infer'' means to "carry forward". Inference is theoretically traditionally divided into deduction and induction, a distinct ...
. All models of Intel
Meteor Lake processors have a built-in ''versatile processor unit'' (''VPU'') for accelerating
inference
Inferences are steps in logical reasoning, moving from premises to logical consequences; etymologically, the word '' infer'' means to "carry forward". Inference is theoretically traditionally divided into deduction and induction, a distinct ...
for computer vision and deep learning.
References
External links
Nvidia Puts The Accelerator To The Metal With Pascal The Next Platform
Eyeriss Project MIT
{{Hardware acceleration
Application-specific integrated circuits
Coprocessors
Computer optimization
Gate arrays
Deep learning