AMD Instinct is

AMD Advanced Micro Devices, Inc. (AMD) is an American multinational corporation and technology company headquartered in Santa Clara, California and maintains significant operations in Austin, Texas. AMD is a hardware and fabless company that de ...

's brand of data center

GPUs A graphics processing unit (GPU) is a specialized electronic circuit designed for digital image processing and to accelerate computer graphics, being present either as a discrete video card or embedded on motherboards, mobile phones, personal ...

. It replaced AMD's FirePro S brand in 2016. Compared to the

Radeon Radeon () is a brand of computer products, including graphics processing units, random-access memory, RAM disk software, and solid-state drives, produced by Radeon Technologies Group, a division of AMD. The brand was launched in 2000 by ATI Tech ...

brand of mainstream consumer/gamer products, the Instinct product line is intended to accelerate deep learning,

artificial neural network In machine learning, a neural network (also artificial neural network or neural net, abbreviated ANN or NN) is a computational model inspired by the structure and functions of biological neural networks. A neural network consists of connected ...

, and

high-performance computing High-performance computing (HPC) is the use of supercomputers and computer clusters to solve advanced computation problems. Overview HPC integrates systems administration (including network and security knowledge) and parallel programming into ...

GPGPU General-purpose computing on graphics processing units (GPGPU, or less often GPGP) is the use of a graphics processing unit (GPU), which typically handles computation only for computer graphics, to perform computation in applications traditiona ...

applications. The AMD Instinct product line directly competes with

Nvidia Nvidia Corporation ( ) is an American multinational corporation and technology company headquartered in Santa Clara, California, and incorporated in Delaware. Founded in 1993 by Jensen Huang (president and CEO), Chris Malachowsky, and Curti ...

's Tesla and

Intel Intel Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California, and Delaware General Corporation Law, incorporated in Delaware. Intel designs, manufactures, and sells computer compo ...

Xeon Phi Xeon Phi is a discontinued series of x86 manycore processors designed and made by Intel. It was intended for use in supercomputers, servers, and high-end workstations. Its architecture allowed use of standard programming languages and applicati ...

and Data Center GPU lines of machine learning and GPGPU cards. The brand was originally known as AMD Radeon Instinct, but AMD dropped the Radeon brand from the name before AMD Instinct MI100 was introduced in November 2020. In June 2022,

supercomputer A supercomputer is a type of computer with a high level of performance as compared to a general-purpose computer. The performance of a supercomputer is commonly measured in floating-point operations per second (FLOPS) instead of million instruc ...

s based on AMD's

Epyc Epyc (stylized as EPYC) is a brand of multi-core x86-64 microprocessors designed and sold by AMD, based on the company's Zen microarchitecture. Introduced in June 2017, they are specifically targeted for the server and embedded system market ...

CPUs and Instinct GPUs took the lead on the

Green500 The Green500 is a biannual ranking of supercomputers, from the TOP500 list of supercomputers, in terms of energy efficiency. The list measures performance per watt using the TOP500 measure of high performance LINPACK benchmarks at double-preci ...

list of the most power-efficient supercomputers with over 50% lead over any other, and held the top first 4 spots. One of them, the AMD-based

Frontier A frontier is a political and geographical term referring to areas near or beyond a boundary. Australia The term "frontier" was frequently used in colonial Australia in the meaning of country that borders the unknown or uncivilised, th ...

is since June 2022 and as of 2023 the fastest supercomputer in the world on the

TOP500 The TOP500 project ranks and details the 500 most powerful non-distributed computing, distributed computer systems in the world. The project was started in 1993 and publishes an updated list of the supercomputers twice a year. The first of these ...

list.

Products

The three initial Radeon Instinct products were announced on December 12, 2016, and released on June 20, 2017, with each based on a different architecture.

MI6

The MI6 is a passively cooled, Polaris 10 based card with 16 GB of

GDDR5 Graphics Double Data Rate 5 Synchronous Dynamic Random-Access Memory (GDDR5 SDRAM) is a type of synchronous graphics random-access memory (SGRAM) with a high bandwidth ("double data rate") interface designed for use in graphics cards, game con ...

memory and with a <150 W TDP. At 5.7

TFLOPS Floating point operations per second (FLOPS, flops or flop/s) is a measure of computer performance in computing, useful in fields of scientific computations that require floating-point calculations. For such cases, it is a more accurate measur ...

(

FP16 In computing, half precision (sometimes called FP16 or float16) is a binary floating-point computer number format that occupies 16 bits (two bytes in modern computers) in computer memory. It is intended for storage of floating-point values in a ...

and FP32), the MI6 is expected to be used primarily for inference, rather than neural network training. The MI6 has a peak double precision (FP64) compute performance of 358 GFLOPS.

MI8

The MI8 is a

Fiji Fiji, officially the Republic of Fiji, is an island country in Melanesia, part of Oceania in the South Pacific Ocean. It lies about north-northeast of New Zealand. Fiji consists of an archipelago of more than 330 islands—of which about ...

based card, analogous to the R9 Nano, has a <175W TDP. The MI8 has 4 GB of

High Bandwidth Memory High Bandwidth Memory (HBM) is a computer memory interface for 3D-stacked synchronous dynamic random-access memory (SDRAM) initially from Samsung, AMD and SK Hynix. It is used in conjunction with high-performance graphics accelerators, network ...

. At 8.2 TFLOPS (FP16 and FP32), the MI8 is marked toward inference. The MI8 has a peak (FP64) double precision compute performance 512 GFLOPS.

MI25

The MI25 is a

Vega Vega is the brightest star in the northern constellation of Lyra. It has the Bayer designation α Lyrae, which is Latinised to Alpha Lyrae and abbreviated Alpha Lyr or α Lyr. This star is relatively close at only from the Sun, and ...

based card, utilizing HBM2 memory. The MI25 performance is expected to be 12.3 TFLOPS using FP32 numbers. In contrast to the MI6 and MI8, the MI25 is able to increase performance when using lower precision numbers, and accordingly is expected to reach 24.6 TFLOPS when using FP16 numbers. The MI25 is rated at <300W TDP with passive cooling. The MI25 also provides 768 GFLOPS peak double precision (FP64) at 1/16th rate.

MI300 series

The MI300A and MI300X are data center accelerators that use the CDNA 3 architecture, which is optimized for high-performance computing (HPC) and generative artificial intelligence (AI) workloads. The CDNA 3 architecture features a scalable chiplet design that leverages TSMC’s advanced packaging technologies, such as CoWoS (chip-on-wafer-on-substrate) and InFO (integrated fan-out), to combine multiple chiplets on a single interposer. The chiplets are interconnected by AMD’s Infinity Fabric, which enables high-speed and low-latency data transfer between the chiplets and the host system. The MI300A is an accelerated processing unit (APU) that integrates 24

Zen 4 Zen 4 is the name for a CPU microarchitecture designed by AMD, released on September 27, 2022. It is the successor to Zen 3 and uses TSMC's N6 process for I/O dies, N5 process for CCDs, and N4 process for APUs. Zen 4 powers Ryzen 7000 per ...

CPU cores with four CDNA 3 GPU cores, resulting in a total of 228 CUs in the GPU section, and 128 GB of HBM3 memory. The Zen 4 CPU cores are based on the 5 nm process node and support the x86-64 instruction set, as well as AVX-512 and BFloat16 extensions. The Zen 4 CPU cores can run general-purpose applications and provide host-side computation for the GPU cores. The MI300A has a peak performance of 61.3 TFLOPS of FP64 (122.6 TFLOPS FP64 matrix) and 980.6 TFLOPS of FP16 (1961.2 TFLOPS with sparsity), as well as 5.3 TB/s of memory bandwidth. The MI300A supports PCIe 5.0 and CXL 2.0 interfaces, which allow it to communicate with other devices and accelerators in a heterogeneous system. The MI300X is a dedicated generative AI accelerator that replaces the CPU cores with additional GPU cores and HBM memory, resulting in a total of 304 CUs (64 cores per CU) and 192 GB of HBM3 memory. The MI300X is designed to accelerate generative AI applications, such as natural language processing, computer vision, and deep learning. The MI300X has a peak performance of 653.7 TFLOPS of TP32 (1307.4 TFLOPS with sparsity) and 1307.4 TFLOPS of FP16 (2614.9 TFLOPS with sparsity), as well as 5.3 TB/s of memory bandwidth. The MI300X also supports PCIe 5.0 and CXL 2.0 interfaces, as well as AMD’s ROCm software stack, which provides a unified programming model and tools for developing and deploying generative AI applications on AMD hardware.

Software

ROCm

Following software is, as of 2022, regrouped under the Radeon Open Compute meta-project.

MxGPU

The MI6, MI8, and MI25 products all support AMD's MxGPU

virtualization In computing, virtualization (abbreviated v12n) is a series of technologies that allows dividing of physical computing resources into a series of virtual machines, operating systems, processes or containers. Virtualization began in the 1960s wit ...

technology, enabling sharing of GPU resources across multiple users.

MIOpen

MIOpen is AMD's deep learning library to enable GPU acceleration of deep learning. Much of this extends the GPUOpen's Boltzmann Initiative software. This is intended to compete with the deep learning portions of Nvidia's

CUDA In computing, CUDA (Compute Unified Device Architecture) is a proprietary parallel computing platform and application programming interface (API) that allows software to use certain types of graphics processing units (GPUs) for accelerated gene ...

library. It supports the deep learning frameworks: Theano, Caffe,

TensorFlow TensorFlow is a Library (computing), software library for machine learning and artificial intelligence. It can be used across a range of tasks, but is used mainly for Types of artificial neural networks#Training, training and Statistical infer ...

, MXNet,

Microsoft Cognitive Toolkit Microsoft Cognitive Toolkit, previously known as CNTK and sometimes styled as The Microsoft Cognitive Toolkit, is a deprecated deep learning framework developed by Microsoft Research. Microsoft Cognitive Toolkit describes neural networks A ne ...

Torch A torch is a stick with combustible material at one end which can be used as a light source or to set something on fire. Torches have been used throughout history and are still used in processions, symbolic and religious events, and in juggl ...

, and Chainer. Programming is supported in

OpenCL OpenCL (Open Computing Language) is a software framework, framework for writing programs that execute across heterogeneous computing, heterogeneous platforms consisting of central processing units (CPUs), graphics processing units (GPUs), di ...

and Python, in addition to supporting the compilation of CUDA through AMD's Heterogeneous-compute Interface for Portability and Heterogeneous Compute Compiler.

Chipset table

References

External links

AMD Instinct accelerators web page
{{Graphics Processing Unit Coprocessors AMD graphics cards GPGPU Parallel computing