Nvidia DGX
The Nvidia DGX (Deep GPU Xceleration) represents a series of servers and workstations designed by Nvidia, primarily geared towards enhancing deep learning applications through the use of general-purpose computing on graphics processing units (GPGPU). These systems typically come in a rackmount format featuring high-performance x86 server CPUs on the motherboard. The core feature of a DGX system is its inclusion of 4 to 8 Nvidia Tesla GPU modules, which are housed on an independent system board. These GPUs can be connected either via a version of the SXM socket or a PCIe x16 slot, facilitating flexible integration within the system architecture. To manage the substantial thermal output, DGX units are equipped with heatsinks and fans designed to maintain optimal operating temperatures. This framework makes DGX units suitable for computational tasks associated with artificial intelligence and machine learning models.


Models


Pascal - Volta


DGX-1

DGX-1 servers feature 8 GPUs based on Pascal or Volta daughter cards with 128 GB of total HBM2 memory, connected by an NVLink mesh network. The DGX-1 was announced on 6 April 2016. All models are based on a dual-socket configuration of Intel Xeon E5 CPUs and are equipped with the following features:
* 512 GB of DDR4-2133
* Dual 10 Gb networking
* 4 x 1.92 TB SSDs
* 3200 W of combined power supply capability
* 3U rackmount chassis

The product line is intended to bridge the gap between GPUs and AI accelerators using specific features for deep learning workloads. The initial Pascal-based DGX-1 delivered 170 teraflops of half-precision processing, while the Volta-based upgrade increased this to 960 teraflops.

The DGX-1 was first available only in the Pascal-based configuration, with the first-generation SXM socket. A later revision of the DGX-1 added support for first-generation Volta cards via the SXM-2 socket. Nvidia offered upgrade kits that allowed users with a Pascal-based DGX-1 to upgrade to a Volta-based DGX-1.
* The Pascal-based DGX-1 has two variants, one with a 16-core Intel Xeon E5-2698 V3 and one with a 20-core E5-2698 V4. Pricing for the E5-2698 V4 variant is unavailable; the Pascal-based DGX-1 with an E5-2698 V3 was priced at launch at $129,000.
* The Volta-based DGX-1 is equipped with an E5-2698 V4 and was priced at launch at $149,000.
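The DGX-1's aggregate figures are simple multiples of the per-GPU specifications. A minimal sketch, assuming 16 GB of HBM2 per GPU and the per-GPU half-precision throughput published in the P100 and V100 datasheets (figures not stated in this article):

```python
# Aggregate DGX-1 figures from per-GPU specs. The per-GPU numbers are
# assumptions taken from the P100/V100 datasheets, not from this article.
GPUS_PER_DGX1 = 8

p100 = {"hbm2_gb": 16, "fp16_tflops": 21.2}   # Pascal P100 (SXM)
v100 = {"hbm2_gb": 16, "fp16_tflops": 120.0}  # Volta V100 (Tensor Core FP16)

def dgx1_totals(gpu):
    """Scale one GPU's memory and throughput to a full 8-GPU DGX-1."""
    return {
        "total_hbm2_gb": GPUS_PER_DGX1 * gpu["hbm2_gb"],
        "total_fp16_tflops": GPUS_PER_DGX1 * gpu["fp16_tflops"],
    }

print(dgx1_totals(p100))  # 128 GB total, ~170 TFLOPS FP16
print(dgx1_totals(v100))  # 128 GB total, 960 TFLOPS FP16
```

Under those assumptions, both configurations reproduce the 128 GB total and the 170/960 teraflop figures quoted above.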


DGX Station

Designed as a turnkey deskside AI supercomputer, the DGX Station is a tower computer that can function completely independently of typical datacenter infrastructure such as cooling, redundant power, or 19-inch racks. The DGX Station was first available with the following specifications:
* Four Volta-based Tesla V100 accelerators, each with 16 GB of HBM2 memory
* 480 TFLOPS FP16
* Single Intel Xeon E5-2698 v4
* 256 GB DDR4
* 4 x 1.92 TB SSDs
* Dual 10 Gb Ethernet

The DGX Station is water-cooled to better manage the heat of almost 1500 W of total system components, which allows it to keep noise under 35 dB under load. This, among other features, made the system a compelling purchase for customers without the infrastructure to run rackmount DGX systems, which can be loud, output a lot of heat, and take up a large area. This was Nvidia's first venture into bringing high-performance computing deskside, which has since remained a prominent marketing strategy for Nvidia.


DGX-2

The Nvidia DGX-2, the successor to the DGX-1, uses sixteen Volta-based V100 32 GB (second generation) cards in a single unit. It was announced on 27 March 2018. The DGX-2 delivers 2 petaflops with 512 GB of shared HBM2 memory for tackling massive datasets, and uses NVSwitch for high-bandwidth internal communication. It also carries a total of 1.5 TB of DDR4 system memory. Also present are eight 100 Gbit/s InfiniBand cards and 30.72 TB of SSD storage, all enclosed within a massive 10U rackmount chassis drawing up to 10 kW under maximum load. The initial price for the DGX-2 was $399,000.

The DGX-2 differs from other DGX models in that it contains two separate GPU daughterboards, each with eight GPUs. These boards are connected by an NVSwitch system that allows for full-bandwidth communication across all GPUs in the system, without additional latency between boards. A higher-performance variant, the DGX-2H, was offered as well. The DGX-2H replaced the DGX-2's dual Intel Xeon Platinum 8168s with upgraded dual Intel Xeon Platinum 8174s. This upgrade does not increase core count per system, as both CPUs have 24 cores, nor does it enable any new functions of the system, but it does increase the base frequency of the CPUs from 2.7 GHz to 3.1 GHz.
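The DGX-2's headline memory figure follows directly from the two-daughterboard layout described above; a quick check of the arithmetic:

```python
# Check the DGX-2's aggregate HBM2 against its per-GPU configuration:
# two daughterboards of eight V100 32 GB cards each.
gpus_per_board = 8
boards = 2
hbm2_per_gpu_gb = 32

total_gpus = gpus_per_board * boards          # 16 GPUs
total_hbm2_gb = total_gpus * hbm2_per_gpu_gb  # 512 GB of shared memory
print(total_gpus, total_hbm2_gb)              # prints: 16 512
```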


Ampere


DGX A100 Server

Announced and released on May 14, 2020, the DGX A100 was the third generation of DGX server, including 8 Ampere-based A100 accelerators. Also included are 15 TB of PCIe gen 4 NVMe storage, 1 TB of RAM, and eight Mellanox-powered 200 Gbit/s HDR InfiniBand ConnectX-6 NICs. The DGX A100 is in a much smaller enclosure than its predecessor, the DGX-2, taking up only 6 rack units. It also moved to dual 64-core AMD EPYC 7742 CPUs, making it the first DGX server not built with Intel Xeon CPUs. The initial price for the DGX A100 server was $199,000.


DGX Station A100

As the successor to the original DGX Station, the DGX Station A100 aims to fill the same niche: a quiet, efficient, turnkey cluster-in-a-box solution that can be purchased, leased, or rented by smaller companies or individuals who want to utilize machine learning. It follows many of the design choices of the original DGX Station, such as the tower orientation and the single-socket CPU mainboard, while adding a new refrigerant-based cooling system and a reduced number of accelerators compared to the corresponding rackmount DGX A100 of the same generation. The DGX Station A100 320G is priced at $149,000 and the 160G model at $99,000; Nvidia also offers Station rental at roughly US$9,000 per month through partners in the US (rentacomputer.com) and Europe (iRent IT Systems) to help reduce the costs of implementing these systems at a small scale.

The DGX Station A100 comes in two configurations of the built-in A100:
* Four Ampere-based A100 accelerators, configured with 40 GB (HBM2) or 80 GB (HBM2e) of memory each, giving a total of 160 GB or 320 GB and resulting in the DGX Station A100 160G or 320G variants
* 2.5 PFLOPS FP16
* Single 64-core AMD EPYC 7742
* 512 GB DDR4
* 1 x 1.92 TB NVMe OS drive
* 1 x 7.68 TB U.2 NVMe drive
* Dual-port 10 Gb Ethernet
* Single-port 1 Gb BMC port


Hopper


DGX H100 Server

Announced March 22, 2022 and planned for release in Q3 2022, the DGX H100 is the 4th generation of DGX server, built with 8 Hopper-based H100 accelerators, for a total of 32 PFLOPs of FP8 AI compute and 640 GB of HBM3 memory, an upgrade over the DGX A100's 640 GB of HBM2 memory. This upgrade also increases VRAM bandwidth to 3 TB/s. The DGX H100 increases the rackmount size to 8U to accommodate the 700 W TDP of each H100 SXM card. It has two 1.92 TB SSDs for operating system storage and 30.72 TB of solid state storage for application data.

One more notable addition is the presence of two Nvidia BlueField-3 DPUs and the upgrade to 400 Gbit/s InfiniBand via Mellanox ConnectX-7 NICs, double the bandwidth of the DGX A100. The DGX H100 uses new 'Cedar Fever' cards, each with four ConnectX-7 400 Gbit/s controllers, and two cards per system. This gives the DGX H100 3.2 Tbit/s of fabric bandwidth across InfiniBand. The DGX H100 has two Xeon Platinum 8480C Scalable CPUs (codenamed Sapphire Rapids) and 2 terabytes of system memory. The DGX H100 was priced at £379,000 (about US$482,000) at release.
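The 3.2 Tbit/s fabric figure follows directly from the Cedar Fever card layout described above:

```python
# Fabric bandwidth of the DGX H100's 'Cedar Fever' networking:
# two cards per system, four 400 Gbit/s ConnectX-7 controllers per card.
cards_per_system = 2
controllers_per_card = 4
gbit_per_controller = 400

fabric_gbit = cards_per_system * controllers_per_card * gbit_per_controller
print(fabric_gbit / 1000, "Tbit/s")  # prints: 3.2 Tbit/s
```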


DGX GH200

Announced May 2023, the DGX GH200 connects 32 Nvidia Grace Hopper Superchips into a single system comprising 256 H100 GPUs, 32 Grace Neoverse V2 72-core CPUs, 32 OSFP single-port 400 Gbit/s InfiniBand ConnectX-7 VPI adapters, and 16 dual-port 200 Gbit/s BlueField-3 VPI adapters from Mellanox. The DGX GH200 is designed to handle terabyte-class models for massive recommender systems, generative AI, and graph analytics, offering 19.5 TB of shared memory with linear scalability for giant AI models.


DGX Helios

Announced May 2023, the DGX Helios supercomputer features four DGX GH200 systems, interconnected with Nvidia Quantum-2 InfiniBand networking to supercharge data throughput for training large AI models. Helios includes 1,024 H100 GPUs in total.


Blackwell


DGX GB200

Announced March 2024, the GB200 NVL72 connects 36 Grace Neoverse V2 72-core CPUs and 72 B200 GPUs in a rack-scale design. The GB200 NVL72 is a liquid-cooled, rack-scale solution with a 72-GPU NVLink domain that acts as a single massive GPU. The Nvidia DGX GB200 offers 13.5 TB of HBM3e shared memory with linear scalability for giant AI models, less than the 19.5 TB of its predecessor, the DGX GH200.


DGX SuperPod

The DGX SuperPod is a high-performance turnkey supercomputer system provided by Nvidia using DGX hardware. It combines DGX compute nodes with fast storage and high-bandwidth networking to provide a solution for high-demand machine learning workloads. The Selene supercomputer, built by Nvidia, is one example of a DGX SuperPod-based system. Selene, built from 280 DGX A100 nodes, ranked 5th on the TOP500 list of the most powerful supercomputers at the time of its completion in June 2020, and has continued to rank highly.

The Hopper-based SuperPod can scale to 32 DGX H100 nodes, for a total of 256 H100 GPUs and 64 x86 CPUs. This gives the complete SuperPod 20 TB of HBM3 memory, 70.4 TB/s of bisection bandwidth, and up to 1 exaFLOP of FP8 AI compute. These SuperPods can be further joined to create larger supercomputers. The Eos supercomputer, designed, built, and operated by Nvidia, was constructed from 18 H100-based SuperPods, totaling 576 DGX H100 systems, 500 Quantum-2 InfiniBand switches, and 360 NVLink switches, allowing Eos to deliver 18 EFLOPs of FP8 compute and 9 EFLOPs of FP16 compute, making it the 5th fastest AI supercomputer in the world according to TOP500 (November 2023 edition).

As Nvidia does not produce storage devices or systems, Nvidia SuperPods rely on partners to provide high-performance storage. Current storage partners for Nvidia SuperPods are Dell EMC, DDN, HPE, IBM, NetApp, Pavilion Data, and VAST Data.
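The SuperPod and Eos figures above compose multiplicatively. A minimal sketch, assuming 80 GB of HBM3 per H100 (a per-GPU figure this article does not state):

```python
# SuperPod and Eos scaling arithmetic for the Hopper generation.
# The 80 GB/GPU figure is an assumption from the H100 datasheet,
# not stated in this article.
gpus_per_node = 8
nodes_per_superpod = 32
hbm3_per_gpu_gb = 80

superpod_gpus = nodes_per_superpod * gpus_per_node         # 256 H100 GPUs
superpod_hbm3_tb = superpod_gpus * hbm3_per_gpu_gb / 1000  # ~20 TB of HBM3

# Eos joins 18 Hopper-based SuperPods into one machine.
eos_nodes = 18 * nodes_per_superpod    # 576 DGX H100 systems
eos_gpus = eos_nodes * gpus_per_node   # 4,608 H100 GPUs
print(superpod_gpus, superpod_hbm3_tb, eos_nodes, eos_gpus)
```

Under that assumption, one SuperPod's 256 GPUs carry roughly the 20 TB of HBM3 quoted above, and Eos's 576 nodes total 4,608 H100 GPUs.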


See also

* Deep Learning Super Sampling

