The Info List - Instructions Per Clock





In computer architecture, instructions per cycle (IPC) is one aspect of a processor's performance: the average number of instructions executed for each clock cycle. It is the multiplicative inverse of cycles per instruction.[1]

Contents

1 Explanation
1.1 Calculation of IPC
1.2 Factors governing IPC
2 Instructions per cycle for various processors
3 Computer speed
4 See also
5 References

Explanation

Calculation of IPC

The number of instructions per second for a processor can be derived by multiplying the number of instructions per cycle by the clock rate (cycles per second, given in hertz) of the processor in question. Floating point operations per second can be derived in the same way from the number of floating point operations per cycle. The number of instructions per second is an approximate indicator of the likely performance of the processor.

The number of instructions executed per clock is not a constant for a given processor; it depends on how the particular software being run interacts with the processor, and indeed the entire machine, particularly the memory hierarchy. However, certain processor features tend to lead to designs that have higher-than-average IPC values: the presence of multiple arithmetic logic units (an ALU is a processor subsystem that can perform elementary arithmetic and logical operations) and short pipelines.

When comparing different instruction sets, a simpler instruction set may lead to a higher IPC figure than an implementation of a more complex instruction set using the same chip technology; however, the more complex instruction set may be able to achieve more useful work with fewer instructions.

Factors governing IPC
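As a minimal sketch of the arithmetic described under Calculation of IPC, the following Python snippet computes IPC, its inverse CPI, and instructions per second. The function names, counter values, and clock rates are all hypothetical, chosen only for illustration; they are not measurements of any real processor.

```python
def ipc(instructions_retired: int, cycles: int) -> float:
    """Average instructions executed per clock cycle."""
    return instructions_retired / cycles


def cpi(instructions_retired: int, cycles: int) -> float:
    """Cycles per instruction, the multiplicative inverse of IPC."""
    return cycles / instructions_retired


def instructions_per_second(ipc_value: float, clock_hz: float) -> float:
    """Instructions per second = IPC x clock rate (in hertz)."""
    return ipc_value * clock_hz


# Hypothetical counter readings for one program run (illustrative only).
run_ipc = ipc(instructions_retired=8_000_000, cycles=4_000_000)
run_cpi = cpi(instructions_retired=8_000_000, cycles=4_000_000)

# Two hypothetical designs: high IPC at a low clock, low IPC at a high clock.
wide_slow = instructions_per_second(ipc_value=2.0, clock_hz=2.0e9)
narrow_fast = instructions_per_second(ipc_value=1.0, clock_hz=4.0e9)

print(run_ipc, run_cpi)
print(wide_slow == narrow_fast)
```

With these made-up numbers, both designs reach the same instructions-per-second figure, which illustrates why neither IPC nor clock rate alone determines throughput.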


A given level of instructions per second can be achieved with a high IPC and a low clock speed (like the AMD Athlon and the early Intel Core series), or with a low IPC and a high clock speed (like the Intel Pentium 4 and, to a lesser extent, the AMD Bulldozer). Both are valid processor designs, and the choice between the two is often dictated by history, engineering constraints, or marketing pressures. However, a high IPC combined with a high frequency gives the best performance.

Instructions per cycle for various processors

These numbers are not the IPC values of these CPUs. They represent the theoretical peak floating-point performance. Note that the numbers below only represent the logical widths of the processor's SIMD units. They do not account for the multiple SIMD pipes present in most architectures, nor do they represent the primary architectural definition of IPC, which measures the average number of scalar instructions (integer, floating-point, and control) retired per cycle. The table therefore appears to confuse FLOPS with IPC.

To get a theoretical GFLOPS (billions of FLOPS) rating for a given CPU, multiply the number in this chart by the number of cores and then by the stock clock (in GHz) of a particular CPU model. For example, a Coffee Lake i7-8700K theoretically handles 32 single-precision floats per cycle, has 6 cores and a 3.7 GHz base clock. This gives it 32 x 6 x 3.7 = 710.4 GFLOPS.
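The GFLOPS rule of thumb above can be sketched in a few lines of Python. The function name `peak_gflops` and its parameter names are assumptions made for this illustration; the inputs are the i7-8700K figures from the example (32 single-precision operations per cycle, 6 cores, 3.7 GHz).

```python
def peak_gflops(flops_per_cycle: int, cores: int, clock_ghz: float) -> float:
    """Theoretical peak GFLOPS = per-core FLOPs per cycle x cores x clock (GHz)."""
    return flops_per_cycle * cores * clock_ghz


# Figures taken from the Coffee Lake i7-8700K example in the text.
i7_8700k = peak_gflops(flops_per_cycle=32, cores=6, clock_ghz=3.7)
print(f"{i7_8700k:.1f} GFLOPS")
```

This is a theoretical upper bound only; as the text notes, it reflects SIMD width rather than the IPC actually achieved by real software.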


| CPU family | Double precision | Single precision |
|---|---|---|
| Intel Core and Intel Nehalem (Harpertown?) | 4 DP IPC | 8 SP IPC |
| Intel Sandy Bridge and Intel Ivy Bridge | 8 DP IPC | 16 SP IPC |
| Intel Haswell (and Devil's Canyon?), Intel Broadwell, Intel Skylake, Intel Kaby Lake and Intel Coffee Lake | 16 DP IPC | 32 SP IPC |
| Intel Xeon Skylake (AVX-512) | 32 DP IPC | 64 SP IPC |
| AMD K10 | 6 DP IPC | 12 SP IPC |
| AMD Bulldozer, AMD Piledriver and AMD Steamroller, per module (two cores) | 12 DP IPC | 24 SP IPC |
| AMD Ryzen and AMD Ryzen 2 | 16 DP IPC | 32 SP IPC |
| Intel Atom (Bonnell, Saltwell, Silvermont and Goldmont) | 2 DP IPC | 4 SP IPC |
| AMD Bobcat | 2 DP IPC | 4 SP IPC |
| AMD Jaguar and Puma | 4 DP IPC | 8 SP IPC |
| ARM Cortex-A7 | 1 DP IPC | 8 SP IPC |
| ARM Cortex-A9 | 1 DP IPC | 8 SP IPC |
| ARM Cortex-A15 | 1 DP IPC | 8 SP IPC |
| ARM Cortex-A32 | 2 DP IPC | 8 SP IPC |
| ARM Cortex-A35 | 2 DP IPC | 8 SP IPC |
| ARM Cortex-A53 | 2 DP IPC | 8 SP IPC |
| ARM Cortex-A57 | 2 DP IPC | 8 SP IPC |
| ARM Cortex-A72 | 2 DP IPC | 8 SP IPC |
| Qualcomm Krait | 1 DP IPC | 8 SP IPC |
| Qualcomm Kryo | 2 DP IPC | 8 SP IPC |
| IBM PowerPC A2 (Blue Gene/Q), per core | 8 DP IPC | SP elements are extended to DP and processed on the same units |
| IBM PowerPC A2 (Blue Gene/Q), per thread | 4 DP IPC | SP elements are extended to DP and processed on the same units |
| Intel Xeon Phi (Knights Corner), per core | 16 DP IPC | 32 SP IPC |
| Intel Xeon Phi (Knights Corner), per thread (4 per core) | 8 DP IPC | 16 SP IPC |
| Standard GPU | Different | 2 SP IPC |

In general, the width of a processor's registers determines how large a number a core can operate on at one time. The number of registers matters too, because some instructions can temporarily combine them.

Computer speed

The useful work that can be done with any computer depends on many factors besides the processor speed. These factors include the instruction set architecture, the processor's microarchitecture, the computer system organization (such as the design of the disk storage system and the capabilities and performance of other attached devices), the efficiency of the operating system, and, most importantly, the high-level design of the application software in use.

For users and purchasers of a computer system, instructions per clock is not a particularly useful indication of the performance of their system. For an accurate measure of performance relevant to them, application benchmarks are much more useful. Awareness of IPC is still useful, in that it provides an easy-to-grasp example of why clock speed is not the only factor relevant to computer performance.

See also

Instructions per second
Cycles per instruction
FLOPS
Megahertz myth
The benchmark article provides a useful introduction to computer performance measurement for those readers interested in the topic.

References

^ John L. Hennessy, David A. Patterson. "Computer architecture: a quantitativ
