ROCm is an Advanced Micro Devices (AMD) software stack for graphics processing unit (GPU) programming. ROCm spans several domains, including general-purpose computing on graphics processing units (GPGPU), high-performance computing (HPC), and heterogeneous computing. It offers several programming models: HIP (GPU-kernel-based programming), OpenMP (directive-based programming), and OpenCL.
ROCm is free, libre and open-source software (except the GPU firmware blobs), and it is distributed under various licenses. ROCm initially stood for Radeon Open Compute platform; however, because Open Compute is a registered trademark, ROCm is no longer an acronym; it is simply AMD's open-source stack designed for GPU compute.
Background
The first GPGPU software stack from ATI/AMD was Close to Metal, which became Stream.
ROCm was launched around 2016 with the Boltzmann Initiative. The ROCm stack builds upon previous AMD GPU stacks; some tools trace back to GPUOpen and others to the Heterogeneous System Architecture (HSA).
Heterogeneous System Architecture Intermediate Language
HSAIL was aimed at producing a middle-level, hardware-agnostic intermediate representation that could be JIT-compiled to the eventual hardware (GPU, FPGA, ...) using the appropriate finalizer. This approach was dropped for ROCm: it now builds only GPU code, using LLVM and its upstreamed AMDGPU backend, although there is still research on such enhanced modularity with LLVM MLIR.
Programming abilities
ROCm as a stack ranges from the kernel driver to end-user applications.
AMD provides introductory videos about AMD GCN hardware and ROCm programming via its learning portal.
One of the best technical introductions to the stack and ROCm/HIP programming is, to date, to be found on Reddit.
Hardware support
ROCm is primarily targeted at discrete professional GPUs, but unofficial support includes the Vega family and RDNA 2 consumer GPUs.
Accelerated Processing Units (APUs) are "enabled" but not officially supported; getting ROCm functional on them is involved.
Professional-grade GPUs
AMD Instinct accelerators are the first-class ROCm citizens, alongside the prosumer Radeon Pro GPU series: they mostly see full support.
The only consumer-grade GPU with relatively equal support is, as of January 2022, the Radeon VII (GCN 5, Vega).
Consumer-grade GPUs
Software ecosystem
Learning resources
AMD ROCm product manager Terry Deem gave a tour of the stack.
Third-party integration
The main consumers of the stack are machine learning and high-performance computing/GPGPU applications.
Machine learning
Various deep learning frameworks have a ROCm backend:
* PyTorch
* TensorFlow
* ONNX
* MXNet
* CuPy
* MIOpen
* Caffe
* IREE (which uses LLVM Multi-Level Intermediate Representation (MLIR))
* llama.cpp
Supercomputing
ROCm is gaining significant traction in the TOP500.
ROCm is used with the exascale supercomputers El Capitan and Frontier.
Some related software is to be found at the AMD Infinity Hub.
Other acceleration & graphics interoperation
As of version 3.0, Blender can use HIP compute kernels for its Cycles renderer.
Other languages
Julia
Julia has the AMDGPU.jl package, which integrates with LLVM and selects components of the ROCm stack. Instead of compiling code through HIP, AMDGPU.jl uses Julia's compiler to generate LLVM IR directly, which is later consumed by LLVM to generate native device code. AMDGPU.jl uses ROCr's HSA implementation to upload native code onto the device and execute it, similar to how HIP loads its own generated device code.
AMDGPU.jl also supports integration with ROCm's rocBLAS (for BLAS), rocRAND (for random number generation), and rocFFT (for FFTs). Future integration with rocALUTION, rocSOLVER, MIOpen, and certain other ROCm libraries is planned.
Software distribution
Official
Installation instructions for Linux and Windows are provided in the official AMD ROCm documentation. ROCm software is currently spread across several public GitHub repositories. Within the main public meta-repository there is an XML manifest for each official release: using git-repo, a version control tool built on top of Git, is the recommended way to synchronize the stack locally.
AMD has started distributing containerized applications for ROCm, notably scientific research applications gathered under the AMD Infinity Hub.
AMD itself distributes packages tailored to various Linux distributions.
Third-party
There is a growing third-party ecosystem packaging ROCm.
Linux distributions are officially packaging ROCm natively, with various degrees of advancement:
Arch Linux, Gentoo, Debian, Fedora, GNU Guix, and NixOS.
There are Spack packages.
Components
There is one kernel-space component, ROCk; the rest of the stack (roughly a hundred components) is made of user-space modules.
The unofficial typographic policy is uppercase ROC followed by lowercase for low-level libraries (e.g. ROCt), and the reverse for user-facing libraries (e.g. rocBLAS).
AMD actively develops with the LLVM community, but upstreaming is not instantaneous and, as of January 2022, is still lagging. AMD still officially packages various LLVM forks for parts that are not yet upstreamed: compiler optimizations destined to remain proprietary, debug support, OpenMP offloading, etc.
Low-level
ROCk Kernel driver
ROCm Device libraries
Support libraries implemented as LLVM bitcode. These provide various utilities and functions for math operations, atomics, queries for launch parameters, on-device kernel launch, etc.
ROCt Thunk
The thunk is responsible for all the thinking and queuing that goes into the stack.
ROCr Runtime
The ROC runtime is a set of APIs/libraries that allows the launch of compute kernels by host applications. It is AMD's implementation of the HSA runtime API. It is different from the ROC Common Language Runtime.
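As an illustrative sketch only (not an excerpt from ROCm documentation), the following minimal C++ program uses the standard HSA runtime API that ROCr implements to enumerate the agents (devices) visible to the runtime; the header path and the lack of error handling are simplifying assumptions.

```cpp
// Minimal sketch: enumerate HSA agents (CPUs and GPUs) through ROCr's
// implementation of the HSA runtime API. Error handling is omitted for brevity.
#include <hsa/hsa.h>   // typically under /opt/rocm/include on ROCm systems
#include <cstdio>

static hsa_status_t print_agent(hsa_agent_t agent, void*) {
    char name[64] = {0};
    hsa_agent_get_info(agent, HSA_AGENT_INFO_NAME, name);
    hsa_device_type_t type;
    hsa_agent_get_info(agent, HSA_AGENT_INFO_DEVICE, &type);
    std::printf("agent: %s (%s)\n", name,
                type == HSA_DEVICE_TYPE_GPU ? "GPU" : "CPU/other");
    return HSA_STATUS_SUCCESS;
}

int main() {
    hsa_init();                               // bring up the runtime
    hsa_iterate_agents(print_agent, nullptr); // visit every agent (device)
    hsa_shut_down();
    return 0;
}
```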
ROCm CompilerSupport
The ROCm code object manager is in charge of interacting with LLVM intermediate representation.
Mid-level
ROCclr Common Language Runtime
The common language runtime is an indirection layer adapting calls to ROCr on Linux and PAL on Windows.
It used to be able to route between different compilers, like the HSAIL compiler. It is now being absorbed by the upper indirection layers (HIP and OpenCL).
OpenCL
ROCm ships its installable client driver (ICD) loader and an OpenCL implementation bundled together. As of January 2022, ROCm 4.5.2 ships OpenCL 2.2 and is lagging behind the competition.
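For illustration, a minimal sketch of host code that enumerates OpenCL platforms through the ICD loader; it uses only standard OpenCL entry points, nothing ROCm-specific, and error checking is omitted.

```cpp
// Minimal sketch: enumerate OpenCL platforms via the ICD loader.
// With ROCm installed, the AMD platform should appear in the list.
#define CL_TARGET_OPENCL_VERSION 220
#include <CL/cl.h>
#include <cstdio>
#include <vector>

int main() {
    cl_uint count = 0;
    clGetPlatformIDs(0, nullptr, &count);           // query platform count
    std::vector<cl_platform_id> platforms(count);
    clGetPlatformIDs(count, platforms.data(), nullptr);
    for (cl_platform_id p : platforms) {
        char name[256] = {0};
        clGetPlatformInfo(p, CL_PLATFORM_NAME, sizeof(name), name, nullptr);
        std::printf("platform: %s\n", name);
    }
    return 0;
}
```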
HIP
Heterogeneous-computing Interface for Portability
The AMD implementation for its GPUs is called HIPAMD. There is also a CPU implementation, mostly for demonstration purposes.
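A minimal sketch of what HIP GPU-kernel-based programming looks like (a vector addition); the kernel syntax and runtime calls mirror CUDA, and error checking is omitted for brevity.

```cpp
// Minimal HIP sketch: vector addition. On a ROCm system this would be built
// with something like: hipcc vector_add.hip -o vector_add
#include <hip/hip_runtime.h>
#include <cstdio>
#include <vector>

__global__ void vector_add(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    std::vector<float> ha(n, 1.0f), hb(n, 2.0f), hc(n);

    float *da, *db, *dc;
    hipMalloc((void**)&da, bytes);
    hipMalloc((void**)&db, bytes);
    hipMalloc((void**)&dc, bytes);
    hipMemcpy(da, ha.data(), bytes, hipMemcpyHostToDevice);
    hipMemcpy(db, hb.data(), bytes, hipMemcpyHostToDevice);

    const int block = 256;
    const int grid = (n + block - 1) / block;
    vector_add<<<grid, block>>>(da, db, dc, n);     // CUDA-style launch syntax
    hipMemcpy(hc.data(), dc, bytes, hipMemcpyDeviceToHost);

    std::printf("c[0] = %f\n", hc[0]);              // expect 3.0
    hipFree(da); hipFree(db); hipFree(dc);
    return 0;
}
```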
HIPCC
HIP builds a `HIPCC` compiler that either wraps Clang and compiles with the open LLVM AMDGPU backend, or redirects to the Nvidia compiler.
HIPIFY
HIPIFY is a source-to-source compiling tool. It translates CUDA to HIP and vice versa, either using a Clang-based tool or a sed-like Perl script.
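To illustrate the kind of one-to-one mapping such a translation performs, here is a hand-written sketch (not actual HIPIFY output) of a few CUDA runtime calls and their HIP equivalents:

```cpp
// Illustrative sketch of the CUDA -> HIP translation HIPIFY performs
// (hand-written here, not tool output).

// CUDA original:
//   #include <cuda_runtime.h>
//   cudaMalloc(&ptr, bytes);
//   cudaMemcpy(ptr, host, bytes, cudaMemcpyHostToDevice);
//   kernel<<<grid, block>>>(ptr);
//   cudaFree(ptr);

// HIP result:
#include <hip/hip_runtime.h>
#include <cstddef>

void copy_and_launch_stub(const void* host, size_t bytes) {
    void* ptr = nullptr;
    hipMalloc(&ptr, bytes);                              // was cudaMalloc
    hipMemcpy(ptr, host, bytes, hipMemcpyHostToDevice);  // was cudaMemcpy
    // kernel<<<grid, block>>>(ptr);                     // launch syntax unchanged
    hipFree(ptr);                                        // was cudaFree
}
```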
GPUFORT
Like HIPIFY, GPUFORT is a tool compiling source code into other third-generation-language sources, allowing users to migrate from CUDA Fortran to HIP Fortran. Even more than HIPIFY, it remains a research project.
High-level
ROCm high-level libraries are usually consumed directly by application software, such as machine learning frameworks. Most of the following libraries are in the General Matrix Multiply (GEMM) category, at which GPU architectures excel.
The majority of these user-facing libraries come in dual form: ''hip'' for the indirection layer that can route to Nvidia hardware, and ''roc'' for the AMD implementation.
rocBLAS / hipBLAS
rocBLAS and hipBLAS are central among the high-level libraries: they are the AMD implementation of Basic Linear Algebra Subprograms (BLAS). rocBLAS uses the library Tensile privately.
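A hedged sketch of calling single-precision GEMM through the hip layer (hipBLAS), which routes to rocBLAS on AMD hardware; matrices are column-major, error checking is omitted, and the header path may differ between ROCm releases.

```cpp
// Minimal hipBLAS sketch: C = alpha * A * B + beta * C for small
// column-major matrices. Error handling omitted for brevity.
#include <hipblas/hipblas.h>   // older ROCm releases use <hipblas.h>
#include <hip/hip_runtime.h>
#include <vector>
#include <cstdio>

int main() {
    const int n = 2;                            // 2x2 matrices
    std::vector<float> a = {1, 2, 3, 4};        // column-major storage
    std::vector<float> b = {5, 6, 7, 8};
    std::vector<float> c(n * n, 0.0f);
    const size_t bytes = n * n * sizeof(float);

    float *da, *db, *dc;
    hipMalloc((void**)&da, bytes);
    hipMalloc((void**)&db, bytes);
    hipMalloc((void**)&dc, bytes);
    hipMemcpy(da, a.data(), bytes, hipMemcpyHostToDevice);
    hipMemcpy(db, b.data(), bytes, hipMemcpyHostToDevice);

    hipblasHandle_t handle;
    hipblasCreate(&handle);
    const float alpha = 1.0f, beta = 0.0f;
    hipblasSgemm(handle, HIPBLAS_OP_N, HIPBLAS_OP_N,
                 n, n, n, &alpha, da, n, db, n, &beta, dc, n);
    hipMemcpy(c.data(), dc, bytes, hipMemcpyDeviceToHost);
    std::printf("C[0][0] = %f\n", c[0]);

    hipblasDestroy(handle);
    hipFree(da); hipFree(db); hipFree(dc);
    return 0;
}
```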
rocSOLVER / hipSOLVER
This pair of libraries constitutes the LAPACK implementation for ROCm and is strongly coupled to rocBLAS.
Utilities
* ROCm developer tools: debugger, tracer, profiler, System Management Interface, validation suite, cluster management.
* GPUOpen tools: GPU analyzer, memory visualizer, etc.
* External tools: radeontop (TUI overview)
Comparison with competitors
ROCm competes with other GPU computing stacks: Nvidia CUDA and Intel oneAPI.
Nvidia CUDA
Nvidia's CUDA is closed-source, whereas AMD ROCm is open source. There is open-source software built on top of the closed-source CUDA, for instance RAPIDS.
CUDA is able to run on consumer GPUs, whereas ROCm support is mostly offered for professional hardware such as AMD Instinct and AMD Radeon Pro.
Nvidia provides a C/C++-centered frontend and its
Parallel Thread Execution (PTX) LLVM GPU backend as the
Nvidia CUDA Compiler (NVCC).
Intel oneAPI
Like ROCm, oneAPI is open source, and all the corresponding libraries are published on its GitHub page.
Unified Acceleration Foundation (UXL)
Unified Acceleration Foundation (UXL) is a technology consortium working on the continuation of the oneAPI initiative, with the goal of creating a new open-standard accelerator software ecosystem and related open standards and specification projects through working groups and special interest groups (SIGs). It aims to compete with Nvidia's CUDA. The main companies behind it are Intel, Google, Arm, Qualcomm, Samsung, Imagination, and VMware.
See also
* AMD Software – a general overview of AMD's drivers, APIs, and development endeavors.
* GPUOpen – AMD's complementary graphics stack
* AMD Radeon Software – AMD's software distribution channel
References
External links
* Docker containers for scientific applications.
AMD software
Application programming interfaces
Concurrent computing
GPGPU
GPGPU libraries
Graphics cards
Graphics hardware
Heterogeneous computing
Machine learning
Parallel computing
Supercomputers