The HPC Challenge Benchmark combines several benchmarks to test a number of independent attributes of the performance of high-performance computer (HPC) systems. The project has been co-sponsored by the DARPA High Productivity Computing Systems program, the United States Department of Energy, and the National Science Foundation.
Context
The performance of complex applications on HPC systems can depend on a variety of independent performance attributes of the hardware. The HPC Challenge Benchmark is an effort to improve visibility into this multidimensional space by combining the measurement of several of these attributes into a single program.
Although the performance attributes of interest are not specific to any particular computer architecture, the reference implementation of the HPC Challenge Benchmark in C and MPI assumes that the system under test is a cluster of shared memory multiprocessor systems connected by a network. Because of this assumed hierarchical system structure, most of the tests are run in several different modes of operation. Following the notation used by the benchmark reports, results labeled "single" mean that the test was run on one randomly chosen processor in the system, results labeled "star" mean that an independent copy of the test was run concurrently on each processor in the system, and results labeled "global" mean that all the processors were working in coordination to solve a single problem (with data distributed across the nodes of the system).
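The three modes can be illustrated with a small sketch. It is not taken from the reference implementation: the function name `plan_run`, its parameters, and the contiguous block split used for "global" are assumptions made for illustration, and individual tests may distribute their data differently.

```python
import random

def plan_run(mode, n_procs, problem_size, seed=0):
    """Return {rank: (start, stop)}: the index range each participating rank works on.

    Hypothetical helper; the mode names follow the benchmark reports.
    """
    if mode == "single":
        # One randomly chosen processor runs the whole test alone.
        rank = random.Random(seed).randrange(n_procs)
        return {rank: (0, problem_size)}
    if mode == "star":
        # Every processor runs its own independent, full-size copy concurrently.
        return {rank: (0, problem_size) for rank in range(n_procs)}
    if mode == "global":
        # All processors cooperate on one problem; the data is split across them.
        # A contiguous block split is assumed here purely for simplicity.
        base, extra = divmod(problem_size, n_procs)
        plan, start = {}, 0
        for rank in range(n_procs):
            stop = start + base + (1 if rank < extra else 0)
            plan[rank] = (start, stop)
            start = stop
        return plan
    raise ValueError(f"unknown mode: {mode!r}")
```

In this sketch "single" and "star" differ only in how many processors are busy at once, which is why comparing the two can expose contention for shared resources such as memory bandwidth within a node.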
Components
The benchmark currently consists of 7 tests (with the modes of operation indicated for each):
# HPL (High Performance LINPACK) – measures performance of a solver for a dense system of linear equations (global).
# DGEMM – measures performance for matrix-matrix multiplication (single, star).
# STREAM – measures sustained memory bandwidth to/from memory (single, star).
# PTRANS – measures the rate at which the system can transpose a large array (global).
# RandomAccess – measures the rate of 64-bit updates to randomly selected elements of a large table (single, star, global).
# FFT – performs a Fast Fourier Transform on a large one-dimensional vector using the generalized Cooley–Tukey algorithm (single, star, global).
# Communication Bandwidth and Latency – MPI-centric performance measurements based on the b_eff bandwidth/latency benchmark.
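To give a feel for what two of these kernels compute, the following is a minimal single-process sketch of a STREAM-style Triad and a RandomAccess-style table update. It is not the reference C/MPI code: all function names are hypothetical, and the shift-register constant `POLY`, the seed, and the power-of-two table size are assumptions of the sketch.

```python
MASK64 = (1 << 64) - 1
POLY = 0x7  # feedback constant assumed for this sketch

def triad(b, c, q):
    """STREAM-style Triad: a[i] = b[i] + q * c[i]."""
    return [bi + q * ci for bi, ci in zip(b, c)]

def triad_bytes(n, word_bytes=8):
    """Bytes moved by one Triad pass: two reads (b, c) and one write (a) per element."""
    return 3 * word_bytes * n

def next_random(ran):
    """Advance a 64-bit shift-register pseudo-random sequence by one step."""
    high_bit = ran >> 63
    return ((ran << 1) & MASK64) ^ (POLY if high_bit else 0)

def random_access(table, n_updates, seed=1):
    """XOR n_updates pseudo-random 64-bit values into randomly selected table slots."""
    size = len(table)
    assert size & (size - 1) == 0, "table size must be a power of two"
    ran = seed
    for _ in range(n_updates):
        ran = next_random(ran)
        # The low bits of the random value select the slot to update.
        table[ran & (size - 1)] ^= ran
    return table
```

Dividing `triad_bytes(n)` by the elapsed time of a Triad pass gives a bandwidth figure, and dividing the update count by the elapsed time of `random_access` gives an update rate. Because each update is an XOR, replaying the same update stream a second time restores the original table, which offers a simple way to check a run.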
Performance attributes
At a high level, the tests are intended to provide coverage of four important attributes of performance: double-precision floating-point arithmetic (DGEMM and HPL), local memory bandwidth (STREAM), network bandwidth for "large" messages (PTRANS, RandomAccess, FFT, b_eff), and network bandwidth for "small" messages (RandomAccess, b_eff). Some of the codes are more complex than others and can have additional performance sensitivities. For example, in some systems HPL performance can be limited by network bandwidth and/or network latency.
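The attribute coverage described above can be written down as a small lookup table. This is only a restatement of the paragraph in code; the names `ATTRIBUTE_TESTS` and `attributes_for` are hypothetical, and secondary sensitivities (such as HPL's possible dependence on the network) are deliberately left out.

```python
# Primary performance attribute -> tests intended to cover it.
ATTRIBUTE_TESTS = {
    "double-precision floating-point arithmetic": ["DGEMM", "HPL"],
    "local memory bandwidth": ["STREAM"],
    'network bandwidth for "large" messages': ["PTRANS", "RandomAccess", "FFT", "b_eff"],
    'network bandwidth for "small" messages': ["RandomAccess", "b_eff"],
}

def attributes_for(test):
    """Return the attributes that a given test is intended to exercise."""
    return [attr for attr, tests in ATTRIBUTE_TESTS.items() if test in tests]
```

For example, `attributes_for("RandomAccess")` returns both network-bandwidth attributes, while `attributes_for("STREAM")` returns only local memory bandwidth.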
Competition
The annual HPC Challenge Award Competition at the Supercomputing Conference focuses on four of the most challenging benchmarks in the suite:
* Global HPL
* Global RandomAccess (or BSS Random Access Benchmark)
* EP STREAM (Triad) per system
* Global FFT
There are two classes of awards:
* Class 1: Best performance on a base or optimized run submitted to the HPC Challenge website.
* Class 2: Most "elegant" implementation of four or five computational kernels including three or more of the HPC Challenge benchmarks.
See also
* Locality of reference
External links
* HPC Challenge Benchmark Official Website
* HPC Challenge Award Competition Official Website
* BSS Random Access Benchmark
* Performance Evaluation and Optimization of Random Memory Access on Multicores with High Productivity (Best Paper Award) at ACM/IEEE HiPC 2010