POWER9

picture info	POWER9 POWER9 is a family of superscalar, multithreading, multi-core microprocessors produced by IBM, based on the Power ISA. It was announced in August 2016. The POWER9-based processors are being manufactured using a 14 nm FinFET process, in 12- and 24-core versions, for scale out and scale up applications, and possibly other variations, since the POWER9 architecture is open for licensing and modification by the OpenPOWER Foundation members. Summit, the ninth fastest supercomputer in the world (based on the Top500 list as of June 2024), is based on POWER9, while also using Nvidia Tesla GPUs as accelerators. Design Core The POWER9 core comes in two variants, a four-way multithreaded one called ''SMT4'' and an eight-way one called ''SMT8''. The SMT4- and SMT8-cores are similar, in that they consist of a number of so-called ''slices'' fed by common schedulers. A slice is a rudimentary 64-bit single-threaded processing core with load store unit (LSU), integer unit (ALU) and a ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Power10 Power10 is a superscalar, multithreading, multi-core microprocessor family, based on the open source Power ISA, and announced in August 2020 at the Hot Chips conference; systems with Power10 CPUs. Generally available from September 2021 in the IBM Power10 Enterprise E1080 server. The processor is designed to have 15 cores available, but a spare core will be included during manufacture to cost-effectively allow for yield issues. Power10-based processors will be manufactured by Samsung using a 7 nm process with 18 layers of metal and 18 billion transistors on a 602 mm2 silicon die. The main features of Power10 are higher performance per watt and better memory and I/O architectures, with a focus on artificial intelligence (AI) workloads. Design Each Power10 core has doubled up on most functional units compared to its predecessor POWER9. The core is eight-way multithreaded (SMT8) and has 48 KB instruction and 32 KB data L1 caches, a 2 MB large L2 cac ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Coherent Accelerator Processor Interface Coherent Accelerator Processor Interface (CAPI), is a high-speed processor expansion bus standard for use in large data center computers, initially designed to be layered on top of PCI Express, for directly connecting central processing units (CPUs) to external accelerators like graphics processing units (GPUs), ASICs, FPGAs or fast storage. It offers low latency, high speed, direct memory access connectivity between devices of different instruction set architectures. History The performance scaling traditionally associated with Moore's Law—dating back to 1965—began to taper off around 2004, as both Intel's Prescott architecture and IBM's Cell processor pushed toward a 4 GHz operating frequency. Here both projects ran into a thermal scaling wall, whereby heat extraction problems associated with further increases in operating frequency largely outweighed gains from shorter cycle times. Over the decade that followed, few commercial CPU products exceeded 4 GHz, ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Summit (supercomputer) Summit or OLCF-4 was a supercomputer developed by IBM for use at Oak Ridge Leadership Computing Facility (OLCF), a facility at the Oak Ridge National Laboratory, United States of America. It held the number 1 position on the TOP500 list from June 2018 to June 2020. As of June 2024, its LINPACK benchmark was clocked at 148.6 petaFLOPS. Summit was decommissioned on November 15, 2024. As of November 2019, the supercomputer had ranked as the 5th most energy efficient in the world with a measured power efficiency of 14.668 gigaFLOPS/watt. Summit was the first supercomputer to reach exaflop (a quintillion operations per second) speed, on a non-standard metric, achieving 1.88 exaflops during a genomic analysis and is expected to reach 3.3 exaflops using mixed-precision calculations. History The United States Department of Energy awarded a $325 million contract in November 2014 to IBM, Nvidia and Mellanox. The effort resulted in construction of Summit ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	NVLink NVLink is a wire-based serial multi-lane near-range communications protocol, communications link developed by Nvidia. Unlike PCI Express, a device can consist of multiple NVLinks, and devices use mesh networking to communicate instead of a central hub (network science), hub. The protocol was first announced in March 2014 and uses a proprietary high-speed signaling interconnect (NVHS). Principle NVLink is developed by Nvidia for data and control code transfers in processor systems between CPUs and GPUs and solely between GPUs. NVLink specifies a point-to-point (telecommunications), point-to-point connection with data rates of 20, 25 and 50 Gbit/s (v1.0/v2.0/v3.0+ resp.) per differential pair. For NVLink 1.0 and 2.0 eight differential pairs form a "sub-link" and two "sub-links", one for each direction, form a "link". Starting from NVlink 3.0 only four differential pairs form a "sub-link". For NVLink 2.0 and higher the total data rate for a sub-link is 25 GB/s and the tota ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Power ISA Power ISA is a reduced instruction set computer (RISC) instruction set architecture (ISA) currently developed by the OpenPOWER Foundation, led by IBM. It was originally developed by IBM and the now-defunct Power.org industry group. Power ISA is an evolution of the PowerPC ISA, created by the mergers of the core PowerPC ISA and the optional Book E for embedded applications. The merger of these two components in 2006 was led by Power.org founders IBM and Freescale Semiconductor. Prior to version 3.0, the ISA is divided into several categories. Processors implement a set of these categories as required for their task. Different classes of processors are required to implement certain categories, for example a server-class processor includes the categories: ''Base'', ''Server'', ''Floating-Point'', ''64-Bit'', etc. All processors implement the Base category. Power ISA is a RISC load/store architecture. It has multiple sets of registers: * ''32'' × 32-bit or 64-bit general- ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	PCI Express PCI Express (Peripheral Component Interconnect Express), officially abbreviated as PCIe, is a high-speed standard used to connect hardware components inside computers. It is designed to replace older expansion bus standards such as Peripheral Component Interconnect, PCI, PCI-X and Accelerated Graphics Port, AGP. Developed and maintained by the PCI-SIG (PCI Special Interest Group), PCIe is commonly used to connect graphics cards, sound cards, Wi-Fi and Ethernet adapters, and storage devices such as solid-state drives and hard disk drives. Compared to earlier standards, PCIe supports faster data transfer, uses fewer pins, takes up less space, and allows devices to be added or removed while the computer is running (hot swapping). It also includes better error detection and supports newer features like I/O virtualization for advanced computing needs. PCIe connections are made through "lanes," which are pairs of wires that send and receive data. Devices can use one or more lanes ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	PowerVM PowerVM, formerly known as Advanced Power Virtualization (APV), is a chargeable feature of IBM POWER5, POWER6, POWER7, POWER8, POWER9 and Power10 servers and is required for support of micro-partitions and other advanced features. Support is provided for IBM i, AIX and Linux. Description IBM PowerVM has the following components: * A "VET" code, which activates firmware required to support resource sharing and other features. * Installation media for the Virtual I/O Server (VIOS), which is a service partition providing sharing services for disk and network adapters. Prior to its withdrawal from marketing in 2011, PowerVM also came with installation media for Lx86, x86 binary translation software, which allows Linux applications compiled for the Intel x86 platform to run in POWER-emulation mode. A supported Linux distribution was a co-requisite for use of this feature. IBM PowerVM comes in two editions: * IBM PowerVM Standard: supported on all POWER6, POWER7 and POWER8 sy ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	POWER8 POWER8 is a family of superscalar multi-core microprocessors based on the Power ISA, announced in August 2013 at the Hot Chips conference. The designs are available for licensing under the OpenPOWER Foundation, which is the first time for such availability of IBM's highest-end processors. Systems based on POWER8 became available from IBM in June 2014. Systems and POWER8 processor designs made by other OpenPOWER members were available in early 2015. Design POWER8 is designed to be a massively multithreaded chip, with each of its cores capable of handling eight hardware threads simultaneously, for a total of 96 threads executed simultaneously on a 12-core chip. The processor makes use of very large amounts of on- and off-chip eDRAM caches, and on-chip memory controllers enable very high bandwidth to memory and system I/O. For most workloads, the chip is said to perform two to three times as fast as its predecessor, the POWER7. POWER8 chips comes in 6- or 12-core variants; each ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Multithreading (computer Architecture) In computer architecture, multithreading is the ability of a central processing unit (CPU) (or a single core in a multi-core processor) to provide multiple threads of execution. Overview The multithreading paradigm has become more popular as efforts to further exploit instruction-level parallelism have stalled since the late 1990s. This allowed the concept of throughput computing to re-emerge from the more specialized field of transaction processing. Even though it is very difficult to further speed up a single thread or single program, most computer systems are actually multitasking among multiple threads or programs. Thus, techniques that improve the throughput of all tasks result in overall performance gains. Two major techniques for throughput computing are ''multithreading'' and ''multiprocessing''. Advantages If a thread gets a lot of cache misses, the other threads can continue taking advantage of the unused computing resources, which may lead to faster overall exe ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Multi-core A multi-core processor (MCP) is a microprocessor on a single integrated circuit (IC) with two or more separate central processing units (CPUs), called ''cores'' to emphasize their multiplicity (for example, ''dual-core'' or ''quad-core''). Each core reads and executes program instructions, specifically ordinary CPU instructions (such as add, move data, and branch). However, the MCP can run instructions on separate cores at the same time, increasing overall speed for programs that support multithreading or other parallel computing techniques. Manufacturers typically integrate the cores onto a single IC die, known as a ''chip multiprocessor'' (CMP), or onto multiple dies in a single chip package. As of 2024, the microprocessors used in almost all new personal computers are multi-core. A multi-core processor implements multiprocessing in a single physical package. Designers may couple cores in a multi-core device tightly or loosely. For example, cores may or may not share c ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Input/output In computing, input/output (I/O, i/o, or informally io or IO) is the communication between an information processing system, such as a computer, and the outside world, such as another computer system, peripherals, or a human operator. Inputs are the signals or data received by the system and outputs are the signals or data sent from it. The term can also be used as part of an action; to "perform I/O" is to perform an input or output operation. are the pieces of hardware used by a human (or other system) to communicate with a computer. For instance, a keyboard or computer mouse is an input device for a computer, while monitors and printers are output devices. Devices for communication between computers, such as modems and network cards, typically perform both input and output operations. Any interaction with the system by an interactor is an input and the reaction the system responds is called the output. The designation of a device as either input or output depend ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	EDRAM Embedded DRAM (eDRAM) is dynamic random-access memory (DRAM) integrated on the same die or multi-chip module (MCM) of an application-specific integrated circuit (ASIC) or microprocessor. eDRAM's cost-per-bit is higher when compared to equivalent standalone DRAM chips used as external memory, but the performance advantages of placing eDRAM onto the same chip as the processor outweigh the cost disadvantages in many applications. In performance and size, eDRAM is positioned between level 3 cache and conventional DRAM on the memory bus, and effectively functions as a level 4 cache, though architectural descriptions may not explicitly refer to it in those terms. Embedding memory on the ASIC or processor allows for much wider buses and higher operation speeds, and due to much higher density of DRAM in comparison to SRAM, larger amounts of memory can be installed on smaller chips if eDRAM is used instead of eSRAM. eDRAM requires additional fab process steps compared with embedded ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]