Nvidia HGX
The Nvidia DGX (Deep GPU Xceleration) is a series of servers and workstations designed by Nvidia, primarily geared towards accelerating deep learning applications through general-purpose computing on graphics processing units (GPGPU). These systems typically come in a rackmount format with high-performance x86 server CPUs on the motherboard. The core feature of a DGX system is its 4 to 8 Nvidia Tesla GPU modules, which are housed on an independent system board. These GPUs can be connected either via a version of the SXM socket or a PCIe x16 slot, allowing flexible integration within the system architecture. To manage the substantial thermal output, DGX units are equipped with heatsinks and fans designed to maintain suitable operating temperatures. This design makes DGX units suitable for the computational tasks associated with artificial intelligence and machine learning models.
Models
Pascal - Volta
DGX-1
DGX-1 servers feature 8 GPUs ...

Nvidia
Nvidia Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California, and incorporated in Delaware. Founded in 1993 by Jensen Huang (president and CEO), Chris Malachowsky, and Curtis Priem, it designs and supplies graphics processing units (GPUs), application programming interfaces (APIs) for data science and high-performance computing, and system on a chip units (SoCs) for mobile computing and the automotive market. Nvidia is also a leading supplier of artificial intelligence (AI) hardware and software. Nvidia outsources the manufacturing of the hardware it designs. Nvidia's professional line of GPUs is used for edge-to-cloud computing and in supercomputers and workstations for applications in fields such as architecture, engineering and construction, media and entertainment, automotive, scientific research, and manufacturing design. Its GeForce line of GPUs is aimed at the consumer market and is used in ap ...

Teraflop
Floating point operations per second (FLOPS, flops or flop/s) is a measure of computer performance, useful in fields of scientific computation that require floating-point calculations. For such cases, it is a more accurate measure than instructions per second.
Floating-point arithmetic
Floating-point arithmetic is needed for very large or very small real numbers, or computations that require a large dynamic range. Floating-point representation is similar to scientific notation, except that computers use base two (with rare exceptions) rather than base ten. The encoding scheme stores the sign, the exponent (in base two for Cray and VAX, base two or ten for IEEE floating point formats, and base 16 for IBM Floating Point Architecture) and the significand (number after the radix point). While several similar formats are in use, the most common is ANSI/IEEE Std. 754-1985. This standard defin ...

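The sign, exponent and significand fields described above can be inspected directly. The following is a minimal sketch, assuming the platform's Python `float` is an IEEE 754 double (1 sign bit, 11 exponent bits with a bias of 1023, 52 stored fraction bits); the helper names `decompose_double` and `rebuild_normal` are hypothetical and chosen only for illustration.

```python
import struct


def decompose_double(x: float) -> tuple[int, int, int]:
    """Split a float into (sign, biased exponent, fraction) bit fields.

    Assumes the IEEE 754 double layout: 1 + 11 + 52 bits.
    """
    # Reinterpret the 8 bytes of the double as one unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                  # top bit
    exponent = (bits >> 52) & 0x7FF    # next 11 bits, biased by 1023
    fraction = bits & ((1 << 52) - 1)  # low 52 bits of the significand
    return sign, exponent, fraction


def rebuild_normal(sign: int, exponent: int, fraction: int) -> float:
    """Recompute the value of a normal number from its three fields.

    Does not handle zeros, subnormals, infinities or NaNs.
    """
    significand = 1 + fraction / 2**52  # implicit leading 1 bit
    return (-1) ** sign * significand * 2.0 ** (exponent - 1023)


if __name__ == "__main__":
    for value in (1.0, -2.0, 0.15625):
        fields = decompose_double(value)
        print(value, fields, rebuild_normal(*fields))
```

The base-two exponent is what distinguishes this layout from scientific notation: 0.15625 is stored as 1.25 × 2^-3 rather than 1.5625 × 10^-1.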
Mellanox
Mellanox Technologies Ltd. was an Israeli-American multinational supplier of computer networking products based on InfiniBand and Ethernet technology. Mellanox offered adapters, switches, software, cables and silicon for markets including high-performance computing, data centers, cloud computing, computer data storage and financial services. On March 11, 2019, Nvidia announced its intent to acquire the company for $6.9 billion. The deal closed on April 27, 2020, with approval from the EU, U.S. and Chinese antitrust authorities. The company was integrated into Nvidia's networking division in 2020 and Nvidia stopped using the brand name "Mellanox" for its new networking products.
History
1999–2009
Mellanox was founded in May 1999 by former Israeli executives of Intel Corporation and Galileo Technology (which was acquired by Marvell Technology Group in October 2000 for $2.8 billion): Eyal Waldman, Shai Cohen, Roni Ashuri, Michael Kagan, Evelyn Landman, Eitan Zahavi, Shimon ...

NVMe
NVM Express (NVMe) or Non-Volatile Memory Host Controller Interface Specification (NVMHCIS) is an open, logical-device interface specification for accessing a computer's non-volatile storage media, usually attached via the PCI Express bus. The initial "NVM" stands for "non-volatile memory", which is often NAND flash memory that comes in several physical form factors, including solid-state drives (SSDs), PCIe add-in cards, and M.2 cards, the successor to mSATA cards. NVM Express, as a logical-device interface, has been designed to capitalize on the low latency and internal parallelism of solid-state storage devices. Architecturally, the logic for NVMe is physically stored within and executed by the NVMe controller chip that is physically co-located with the storage media, usually an SSD. Version changes for NVMe, e.g., 1.3 to 1.4, are incorporated within the storage media, and do not affect PCIe-compatible components such as motherboards and CPUs. By its design, NVM Expre ...

PCI Express
PCI Express (Peripheral Component Interconnect Express), officially abbreviated as PCIe, is a high-speed standard used to connect hardware components inside computers. It is designed to replace older expansion bus standards such as PCI, PCI-X and AGP. Developed and maintained by the PCI-SIG (PCI Special Interest Group), PCIe is commonly used to connect graphics cards, sound cards, Wi-Fi and Ethernet adapters, and storage devices such as solid-state drives and hard disk drives. Compared to earlier standards, PCIe supports faster data transfer, uses fewer pins, takes up less space, and allows devices to be added or removed while the computer is running (hot swapping). It also includes better error detection and supports newer features such as I/O virtualization for advanced computing needs. PCIe connections are made through "lanes," which are pairs of wires that send and receive data. Devices can use one or more lanes ...

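Because a link is built from one or more lanes, its raw bandwidth scales with the lane count. The following is a rough sketch of that arithmetic. The per-lane transfer rates and line-encoding ratios in the table are the commonly quoted nominal figures for PCIe generations 1 to 5; they are not taken from the text above, and the function name `link_bandwidth_gb_per_s` is hypothetical. Packet and protocol overhead are ignored, so real throughput is lower.

```python
# Nominal per-lane figures, (transfer rate in GT/s, encoding efficiency).
# Assumption: these values come from general knowledge of the PCIe
# generations, not from the passage above.
PCIE_GENERATIONS = {
    1: (2.5, 8 / 10),      # 8b/10b encoding
    2: (5.0, 8 / 10),      # 8b/10b encoding
    3: (8.0, 128 / 130),   # 128b/130b encoding
    4: (16.0, 128 / 130),  # 128b/130b encoding
    5: (32.0, 128 / 130),  # 128b/130b encoding
}


def link_bandwidth_gb_per_s(generation: int, lanes: int) -> float:
    """Approximate one-direction bandwidth of a PCIe link in GB/s.

    Multiplies the per-lane rate by the encoding efficiency and the lane
    count, then converts bits to bytes.
    """
    if generation not in PCIE_GENERATIONS:
        raise ValueError(f"unknown PCIe generation: {generation}")
    if lanes < 1:
        raise ValueError("a link needs at least one lane")
    rate_gt_per_s, efficiency = PCIE_GENERATIONS[generation]
    return rate_gt_per_s * efficiency * lanes / 8


if __name__ == "__main__":
    for lanes in (1, 4, 16):
        print(f"Gen 3 x{lanes}: {link_bandwidth_gb_per_s(3, lanes):.2f} GB/s")
```

Under these assumptions, an x16 link carries sixteen times the bandwidth of an x1 link of the same generation.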
Ampere (microarchitecture)
Ampere is the codename for a graphics processing unit (GPU) microarchitecture developed by Nvidia as the successor to both the Volta and Turing architectures. It was officially announced on May 14, 2020, and is named after French mathematician and physicist André-Marie Ampère. Nvidia announced the Ampere architecture GeForce 30 series consumer GPUs at a GeForce Special Event on September 1, 2020. Nvidia announced the A100 80 GB GPU at SC20 on November 16, 2020. Mobile RTX graphics cards and the RTX 3060 based on the Ampere architecture were revealed on January 12, 2021. Nvidia announced Ampere's successor, Hopper, at GTC 2022, and "Ampere Next Next" (Blackwell) for a 2024 release at GPU Technology Conference 2021.
Details
Architectural improvements of the Ampere architecture include the following:
* CUDA Compute Capability 8.0 for A100 and 8.6 for the GeForce 30 series
* TSMC's 7 nm FinFET process for A100
* Custom version of Samsung's 8 nm pr ...

InfiniBand
InfiniBand (IB) is a computer networking communications standard used in high-performance computing that features very high throughput and very low latency. It is used for data interconnect both among and within computers. InfiniBand is also used as either a direct or switched interconnect between servers and storage systems, as well as an interconnect between storage systems. It is designed to be scalable and uses a switched fabric network topology. Between 2014 and June 2016, it was the most commonly used interconnect in the TOP500 list of supercomputers. Mellanox (acquired by Nvidia) manufactures InfiniBand host bus adapters and network switches, which are used by large computer system and database vendors in their product lines. As a computer cluster interconnect, IB competes with Ethernet, Fibre Channel, and Intel Omni-Path. The technology is promoted by the InfiniBand Trade Association.
History
InfiniBand originated in 1999 from the merger of two competing designs: ...

High-performance Computing
High-performance computing (HPC) is the use of supercomputers and computer clusters to solve advanced computation problems.
Overview
HPC integrates systems administration (including network and security knowledge) and parallel programming into a multidisciplinary field that combines digital electronics, computer architecture, system software, programming languages, algorithms and computational techniques. HPC technologies are the tools and systems used to implement and create high-performance computing systems. Recently, HPC systems have shifted from supercomputing to computing clusters and grids. Because clusters and grids depend on networking, HPC technologies are being promoted through the use of a collapsed network backbone: the collapsed backbone architecture is simple to troubleshoot, and upgrades can be applied to a single router as opposed to multiple ones. HPC integrates with data analytics in AI engineering workflows to generate ...

Water Cooling
[Figure: Cooling tower and water discharge of a nuclear power plant]
Water cooling is a method of heat removal from components and industrial equipment. Evaporative cooling using water is often more efficient than air cooling. Water is inexpensive and non-toxic; however, it can contain impurities and cause corrosion. Water cooling is commonly used for cooling automobile internal combustion engines and power stations. Water coolers utilising convective heat transfer are used inside high-end personal computers to lower the temperature of CPUs and other components. Other uses include the cooling of lubricant oil in pumps, cooling in heat exchangers, and cooling buildings in HVAC systems and chillers.
Mechanism
Advantages
Water is inexpensive, non-toxic, and available over most of the earth's surface. Liquid cooling offers higher thermal conductivity than air cooling. Water has unusually hi ...

High Bandwidth Memory
High Bandwidth Memory (HBM) is a computer memory interface for 3D-stacked synchronous dynamic random-access memory (SDRAM) initially from Samsung, AMD and SK Hynix. It is used in conjunction with high-performance graphics accelerators, network devices, high-performance datacenter AI ASICs, as on-package cache in CPUs and on-package RAM in upcoming CPUs, in FPGAs, and in some supercomputers (such as the NEC SX-Aurora TSUBASA and Fujitsu A64FX). The first HBM memory chip was produced by SK Hynix in 2013, and the first devices to use HBM were the AMD Fiji GPUs in 2015. HBM was adopted by JEDEC as an industry standard in October 2013 (High Bandwidth Memory (HBM) DRAM, JESD235, JEDEC, October 2013). The second generation, HBM2, was accepted by JEDEC in January 2016.

19-inch Rack
A 19-inch rack is a standardized frame or enclosure for mounting multiple electronic equipment modules. Each module has a front panel that is 19 inches wide. The 19-inch dimension includes the edges or "ears" that protrude from each side of the equipment, allowing the module to be fastened to the rack frame with screws or bolts. Common uses include computer servers, telecommunications equipment and networking hardware, audiovisual production gear, professional audio equipment, and scientific equipment.
Overview and history
Equipment designed to be placed in a rack is typically described as rack-mount, rack-mount instrument, a rack-mounted system, a rack-mount chassis, subrack, rack cabinet, rack-mountable, or occasionally simply shelf. The height of the electronic modules is also standardized, as multiples of one rack unit or U (less commonly RU). The industry-standard rack cabinet is 42U tall; however, many data centers have racks taller than this. The term relay rack ...

Computer Tower
In personal computing, a tower unit, or simply a tower, is a form factor of desktop computer case whose height is much greater than its width, thus having the appearance of an upstanding tower block, as opposed to a traditional "pizza box" computer case whose width is greater than its height and appears lying flat. Compared to a pizza box case, the tower tends to be larger and offers more potential for internal volume for the same desk area occupied, and therefore allows more hardware installation and theoretically better airflow for cooling. Multiple size subclasses of the tower form factor have been established to differentiate their varying sizes, including full-tower, mid-tower, midi-tower, mini-tower, and deskside; these classifications are however nebulously defined and inconsistently applied by different manufacturers. Although the traditional layout for a tower system is to have the case placed on top of the desk alongside the monitor and other peripherals, a far more ...