In computing, a compute kernel is a routine compiled for high-throughput accelerators (such as graphics processing units (GPUs), digital signal processors (DSPs) or field-programmable gate arrays (FPGAs)), separate from but used by a main program (typically running on a central processing unit). Compute kernels are sometimes called compute shaders, sharing execution units with vertex shaders and pixel shaders on GPUs, but they are not limited to execution on one class of device or to graphics APIs.
Description
Compute kernels roughly correspond to inner loops when implementing algorithms in traditional languages (except there is no implied sequential operation), or to code passed to internal iterators.
They may be specified by a separate programming language such as "OpenCL C" (managed by the OpenCL API), as "compute shaders" written in a shading language (managed by a graphics API such as OpenGL), or embedded directly in application code written in a high-level language, as in the case of C++ AMP.
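The correspondence with inner loops can be illustrated with a minimal sketch. It is written in plain Python rather than a real kernel language, and the names `saxpy_loop`, `saxpy_kernel` and `launch` are hypothetical, chosen only for this illustration. The loop body is lifted into a function of an index, and a launcher invokes it once per element in an arbitrary order:

```python
import random

def saxpy_loop(a, x, y):
    """Traditional form: the inner loop implies sequential iteration."""
    out = [0.0] * len(x)
    for i in range(len(x)):
        out[i] = a * x[i] + y[i]
    return out

def saxpy_kernel(i, a, x, y, out):
    """Kernel form: only the loop body, parameterised by the invocation index."""
    out[i] = a * x[i] + y[i]

def launch(kernel, n, *args):
    """Hypothetical launcher: one invocation per index, in no particular order."""
    indices = list(range(n))
    random.shuffle(indices)  # order must not matter for a valid kernel
    for i in indices:
        kernel(i, *args)

x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(x)
launch(saxpy_kernel, len(x), 2.0, x, y, out)
```

Because the kernel carries no assumption about iteration order, the shuffled launch produces the same result as the sequential loop. A real accelerator runtime would run the invocations in parallel instead of shuffling them.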
Vector processing
This programming paradigm maps well to vector processors: there is an assumption that each invocation of a kernel within a batch is independent, allowing for data parallel execution. However, atomic operations may be used in some scenarios for synchronization between elements (for interdependent work). Individual invocations are given indices (in one or more dimensions) from which arbitrary addressing of buffer data may be performed (including scatter/gather operations), so long as the non-overlapping assumption is respected.
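The following is a rough sketch of both cases, again in plain Python with hypothetical names (`transpose_kernel`, `histogram_kernel`, `launch`). The transpose kernel uses a two-dimensional index to gather from one location and scatter to another, with every invocation writing a distinct element. The histogram kernel has interdependent writes, so a lock is used here as a stand-in for a hardware atomic add:

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def transpose_kernel(ix, iy, src, dst, width, height):
    """Gather from (ix, iy) in src and scatter to (iy, ix) in dst.
    Each invocation writes a distinct element, so writes never overlap."""
    dst[ix * height + iy] = src[iy * width + ix]

_lock = threading.Lock()  # stand-in for an atomic add (assumption of this sketch)

def histogram_kernel(i, data, bins):
    """Several invocations may update the same bin, so the update is guarded."""
    with _lock:
        bins[data[i]] += 1

def launch(kernel, shape, *args):
    """Hypothetical launcher: one invocation per index tuple, on a thread pool."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(kernel, *idx, *args)
                   for idx in product(*map(range, shape))]
        for f in futures:
            f.result()  # re-raise any exception from a worker

# 2 rows x 3 columns, stored row-major in a flat buffer
width, height = 3, 2
src = [1, 2, 3,
       4, 5, 6]
dst = [0] * (width * height)
launch(transpose_kernel, (width, height), src, dst, width, height)

data = [0, 1, 1, 2, 2, 2]
bins = [0, 0, 0]
launch(histogram_kernel, (len(data),), data, bins)
```

The transpose needs no synchronization because its output locations are disjoint. The histogram does, because its output location depends on the data rather than on the invocation index.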
Vulkan API
The Vulkan API provides the intermediate SPIR-V representation to describe both graphical shaders and compute kernels in a language-independent and machine-independent manner. The intention is to facilitate language evolution and provide a more natural ability to leverage GPU compute capabilities, in line with hardware developments such as unified memory architecture and heterogeneous system architecture. This allows closer cooperation between a CPU and GPU.
See also
* Kernel (image processing)
* DirectCompute
* CUDA
* OpenMP
* OpenCL
* SPIR-V
* SYCL
* Metal (API)
* GPGPU
* Vector processor
* Xeon Phi
* Digital signal processor
* Field-programmable gate array
* AI accelerator
** Vision processing unit
* Manycore
* Stream processing
* Computer for operations with functions