The Info List - Address Generation Unit

--- Advertisement ---

Address generation unit
Address generation unit
(AGU), sometimes also called address computation unit (ACU),[1] is an execution unit inside central processing units (CPUs) that calculates addresses used by the CPU to access main memory. By having address calculations handled by separate circuitry that operates in parallel with the rest of the CPU, the number of CPU cycles required for executing various machine instructions can be reduced, bringing performance improvements.[2][3] While performing various operations, CPUs need to calculate memory addresses required for fetching data from the memory; for example, in-memory positions of array elements must be calculated before the CPU can fetch the data from actual memory locations. Those address-generation calculations involve different integer arithmetic operations, such as addition, subtraction, modulo operations, or bit shifts. Often, calculating a memory address involves more than one general-purpose machine instruction, which do not necessarily decode and execute quickly. By incorporating an AGU into a CPU design, together with introducing specialized instructions that use the AGU, various address-generation calculations can be offloaded from the rest of the CPU, and can often be executed quickly in a single CPU cycle.[2][3] Capabilities of an AGU depend on a particular CPU and its architecture. Thus, some AGUs implement and expose more address-calculation operations, while some also include more advanced specialized instructions that can operate on multiple operands at a time.[2][3] Furthermore, some CPU architectures include multiple AGUs so more than one address-calculation operation can be executed simultaneously, bringing further performance improvements by capitalizing on the superscalar nature of advanced CPU designs. For example, Intel
incorporates multiple AGUs into its Sandy Bridge and Haswell microarchitectures, which increase bandwidth of the CPU memory subsystem by allowing multiple memory-access instructions to be executed in parallel.[4][5][6] See also[edit]

Information technology portal

Arithmetic logic unit – a digital circuit that performs arithmetic and bitwise logical operations on integer binary numbers Bulldozer (microarchitecture) – another CPU microarchitecture that includes multiple AGUs, developed by AMD Register renaming – a technique that reuses CPU registers and avoids unnecessary serialization of program operations Reservation station – a CPU feature that allows results of various operations to be used while bypassing CPU registers


^ Cornelis Van Berkel; Patrick Meuwissen (January 12, 2006). "Address generation unit for a processor (US 2006010255 A1 patent application)". google.com. Retrieved December 8, 2014.  ^ a b c "Chapter 4: Address Generation Unit (DSP56300 Family Manual)" (PDF). ecee.colorado.edu. September 16, 1999. Retrieved December 8, 2014.  ^ a b c Darek Mihocka (December 27, 2000). "Pentium 4: Round 1 – Intel
blows the lead". emulators.com. Retrieved December 8, 2014.  ^ David Kanter (September 25, 2010). "Intel's Sandy Bridge Microarchitecture: Memory Subsystem". realworldtech.com. Retrieved December 8, 2014.  ^ David Kanter (November 13, 2012). "Intel's Haswell CPU Microarchitecture: Haswell Memory Hierarchy". realworldtech.com. Retrieved December 8, 2014.  ^ Per Hammarlund (August 2013). "Fourth-Generation Intel
Core Processor, codenamed Haswell" (PDF). hotchips.org. p. 25. Retrieved December 8, 2014. 

External links[edit]

Wikimedia Commons has media related to Address generation units.

Address generation unit
Address generation unit
in the Motorola DSP56K family, June 2003, Motorola A new approach to design of an AGU in a DSP processor, November 2011, by Kabiraj Sethi and Rutuparna Panda Address generation unit
Address generation unit
in DSP applications, September 2013, by Andreas Ehliar Computer
Science from the Bottom Up, Chapter 3. Computer
Architecture, September 2013, by Ian Wienand

v t e

CPU technologies


Turing machine Post–Turing machine Universal Turing machine Quantum Turing machine Belt machine Stack machine Register machine Counter machine Pointer machine Random access machine Random access stored program machine Finite-state machine Queue automaton Von Neumann Harvard (modified) Dataflow TTA Cellular Artificial neural network

Machine learning Deep learning Neural processing unit (NPU)

Convolutional neural network Load/store architecture Register memory architecture Endianness FIFO Zero-copy NUMA HUMA HSA Mobile computing Surface computing Wearable computing Heterogeneous computing Parallel computing Concurrent computing Distributed computing Cloud computing Amorphous computing Ubiquitous computing Fabric computing Cognitive computing Unconventional computing Hypercomputation Quantum computing Adiabatic quantum computing Linear optical quantum computing Reversible computing Reverse computation Reconfigurable computing Optical computing Ternary computer Analogous computing Mechanical computing Hybrid computing Digital computing DNA computing Peptide computing Chemical computing Organic computing Wetware computing Neuromorphic computing Symmetric multiprocessing
Symmetric multiprocessing
(SMP) Asymmetric multiprocessing
Asymmetric multiprocessing
(AMP) Cache hierarchy Memory hierarchy

ISA types



x86 z/Architecture ARM MIPS Power Architecture
Power Architecture
(PowerPC) SPARC Mill Itanium
(IA-64) Alpha Prism SuperH V850 Clipper VAX Unicore PA-RISC MicroBlaze RISC-V

Word size

1-bit 2-bit 4-bit 8-bit 9-bit 10-bit 12-bit 15-bit 16-bit 18-bit 22-bit 24-bit 25-bit 26-bit 27-bit 31-bit 32-bit 33-bit 34-bit 36-bit 39-bit 40-bit 48-bit 50-bit 60-bit 64-bit 128-bit 256-bit 512-bit Variable


Instruction pipelining

Bubble Operand forwarding

Out-of-order execution

Register renaming

Speculative execution

Branch predictor Memory dependence prediction


Parallel level


Bit-serial Word

Instruction Pipelining

Scalar Superscalar


Thread Process





Temporal Simultaneous (SMT) (Hyper-threading) Speculative (SpMT) Preemptive Cooperative Clustered Multi-Thread (CMT) Hardware scout

Flynn's taxonomy



Addressing mode

CPU performance

Instructions per second (IPS) Instructions per clock (IPC) Cycles per instruction (CPI) Floating-point operations per second (FLOPS) Transactions per second (TPS) Synaptic Updates Per Second (SUPS) Performance per watt Orders of magnitude (computing) Cache performance measurement and metric

Core count

Single-core processor Multi-core processor Manycore processor


Central processing unit
Central processing unit
(CPU) GPGPU AI accelerator Vision processing unit (VPU) Vector processor Barrel processor Stream processor Digital signal processor
Digital signal processor
(DSP) I/O processor/DMA controller Network processor Baseband processor Physics processing unit
Physics processing unit
(PPU) Coprocessor Secure cryptoprocessor ASIC FPGA FPOA CPLD Microcontroller Microprocessor Mobile processor Notebook processor Ultra-low-voltage processor Multi-core processor Manycore processor Tile processor Multi-chip module
Multi-chip module
(MCM) Chip stack multi-chip modules System on a chip
System on a chip
(SoC) Multiprocessor system-on-chip (MPSoC) Programmable System-on-Chip
(PSoC) Network on a chip (NoC)


Execution unit (EU) Arithmetic logic unit
Arithmetic logic unit
(ALU) Address generation unit
Address generation unit
(AGU) Floating-point unit
Floating-point unit
(FPU) Load-store unit (LSU) Branch predictor Unified Reservation Station Barrel shifter Uncore Sum addressed decoder (SAD) Front-side bus Back-side bus Northbridge (computing) Southbridge (computing) Adder (electronics) Binary multiplier Binary decoder Address decoder Multiplexer Demultiplexer Registers Cache Memory management unit
Memory management unit
(MMU) Input–output memory management unit
Input–output memory management unit
(IOMMU) Integrated Memory Controller (IMC) Power Management Unit (PMU) Translation lookaside buffer
Translation lookaside buffer
(TLB) Stack engine Register file Processor register Hardware register Memory buffer register (MBR) Program counter Microcode
ROM Datapath Control unit Instruction unit Re-order buffer Data buffer Write buffer Coprocessor Electronic switch Electronic circuit Integrated circuit Three-dimensional integrated circuit Boolean circuit Digital circuit Analog circuit Mixed-signal integrated circuit Power management integrated circuit Quantum circuit Logic gate

Combinational logic Sequential logic Emitter-coupled logic
Emitter-coupled logic
(ECL) Transistor–transistor logic
Transistor–transistor logic
(TTL) Glue logic

Quantum gate Gate array Counter (digital) Bus (computing) Semiconductor device Clock rate CPU multiplier Vision chip Memristor

Power management

APM ACPI Dynamic frequency scaling Dynamic voltage scaling Clock gating

Hardware security

Non-executable memory (NX bit) Memory Protection Extensions ( Intel
MPX) Intel
Secure Key Hardware restriction (firmware) Software Guard Extensions ( Intel
SGX) Trusted Execution Technology Trusted Platform Module
Trusted Platform Module
(TPM) Secure cryptoprocessor Hardware security module Hengzhi chip


History of general-purpose CPUs

v t e

Basic computer components

Input devices

Keyboard Image scanner Microphone Pointing device

Graphics tablet Joystick Light pen Mouse


Pointing stick Touchpad Touchscreen Trackball



Refreshable braille display

Output devices

Monitor Refreshable braille display Printer Speakers Plotter

Removable data storage

Optical disc

CD DVD Blu-ray

Disk pack Floppy disk Memory card USB
flash drive


Central processing unit
Central processing unit
(CPU) HDD / SSD / SSHD Motherboard Network interface controller Power supply Random-access memory
Random-access memory
(RAM) Sound card Video card Fax modem Expansion card


Ethernet FireWire (IEEE 1394) Parallel port Serial port PS/2 port USB Thunderbolt HDMI
/ DVI / VG