
In computing, an arithmetic logic unit (ALU) is a combinational digital circuit that performs arithmetic and bitwise operations on integer binary numbers. This is in contrast to a floating-point unit (FPU), which operates on floating-point numbers. It is a fundamental building block of many types of computing circuits, including the central processing unit (CPU) of computers, FPUs, and graphics processing units (GPUs).
The inputs to an ALU are the data to be operated on, called operands, and a code indicating the operation to be performed (opcode); the ALU's output is the result of the performed operation. In many designs, the ALU also has status inputs or outputs, or both, which convey information about a previous operation or the current operation, respectively, between the ALU and external status registers.
Signals
An ALU has a variety of input and output nets, which are the electrical conductors used to convey digital signals between the ALU and external circuitry. When an ALU is operating, external circuits apply signals to the ALU inputs and, in response, the ALU produces and conveys signals to external circuitry via its outputs.
Data
A basic ALU has three parallel data buses consisting of two input operands (''A'' and ''B'') and a result output (''Y''). Each data bus is a group of signals that conveys one binary integer number. Typically, the A, B and Y bus widths (the number of signals comprising each bus) are identical and match the native word size of the external circuitry (e.g., the encapsulating CPU or other processor).
Opcode
The ''opcode'' input is a parallel bus that conveys to the ALU an operation selection code, which is an enumerated value that specifies the desired arithmetic or logic operation to be performed by the ALU. The opcode size (its bus width) determines the maximum number of distinct operations the ALU can perform; for example, a four-bit opcode can specify up to sixteen different ALU operations. Generally, an ALU opcode is not the same as a machine language instruction, though in some cases it may be directly encoded as a bit field within such instructions.
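As a rough illustration of these two points, the sketch below treats a four-bit opcode as a key into a table of operations and extracts it from a bit field of a made-up 16-bit instruction word. The instruction layout (opcode in bits 11 to 8), the opcode assignments, and all names are assumptions chosen for illustration; they do not correspond to any real instruction set.

```python
# Hypothetical 4-bit opcode assignments; a 4-bit field can select at most 16 operations.
OPCODE_BITS = 4
OPERATIONS = {
    0b0000: "add",
    0b0001: "subtract",
    0b0010: "and",
    0b0011: "or",
    0b0100: "xor",
}


def max_operations(opcode_bits: int) -> int:
    """Number of distinct operations an opcode of this width can select."""
    return 1 << opcode_bits


def extract_alu_opcode(instruction: int) -> int:
    """Pull the ALU opcode out of bits 11..8 of an assumed 16-bit instruction word."""
    return (instruction >> 8) & ((1 << OPCODE_BITS) - 1)
```

With this assumed layout, `extract_alu_opcode(0x0412)` yields `0b0100`, which the table maps to `"xor"`.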
Status
Outputs
The status outputs are various individual signals that convey supplemental information about the result of the current ALU operation. General-purpose ALUs commonly have status signals such as:
* ''Carry-out'', which conveys the carry resulting from an addition operation, the borrow resulting from a subtraction operation, or the overflow bit resulting from a binary shift operation.
* ''Zero'', which indicates all bits of Y are logic zero.
* ''Negative'', which indicates the result of an arithmetic operation is negative.
* ''Overflow'', which indicates the result of an arithmetic operation has exceeded the numeric range of Y.
* ''Parity'', which indicates whether an even or odd number of bits in Y are logic one.
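The status outputs listed above can be sketched for an addition on an assumed 8-bit ALU as follows. The function name, the dictionary keys, and the choice to report parity as "1 when an odd number of bits are set" are illustrative assumptions; real designs differ in which flags they provide and how they encode them.

```python
WIDTH = 8                      # assumed ALU word size
MASK = (1 << WIDTH) - 1        # 0xFF for an 8-bit word


def add_with_flags(a: int, b: int, carry_in: int = 0) -> dict:
    """Add two 8-bit operands and derive typical status outputs from the result."""
    total = (a & MASK) + (b & MASK) + (carry_in & 1)
    y = total & MASK
    sign_a = (a >> (WIDTH - 1)) & 1
    sign_b = (b >> (WIDTH - 1)) & 1
    sign_y = (y >> (WIDTH - 1)) & 1
    return {
        "y": y,
        "carry": total >> WIDTH,          # carry out of the most significant bit
        "zero": int(y == 0),              # all bits of Y are zero
        "negative": sign_y,               # sign bit of Y, read as two's complement
        # Signed overflow: operands share a sign that the result does not have.
        "overflow": int(sign_a == sign_b and sign_y != sign_a),
        "parity_odd": bin(y).count("1") & 1,
    }
```

For example, `add_with_flags(0x7F, 0x01)` produces `y = 0x80` with the negative and overflow flags set and no carry, whereas `add_with_flags(0xFF, 0x01)` produces `y = 0` with the carry and zero flags set and no overflow.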
Inputs
The status inputs allow additional information to be made available to the ALU when performing an operation. Typically, this is a single "carry-in" bit that is the stored carry-out from a previous ALU operation.
Circuit operation

An ALU is a combinational logic circuit, meaning that its outputs will change asynchronously in response to input changes. In normal operation, stable signals are applied to all of the ALU inputs and, when enough time (known as the "propagation delay") has passed for the signals to propagate through the ALU circuitry, the result of the ALU operation appears at the ALU outputs. The external circuitry connected to the ALU is responsible for ensuring the stability of ALU input signals throughout the operation, and for allowing sufficient time for the signals to propagate through the ALU circuitry before sampling the ALU outputs.
In general, external circuitry controls an ALU by applying signals to the ALU inputs. Typically, the external circuitry employs sequential logic to generate the signals that control ALU operation. The external sequential logic is paced by a clock signal of sufficiently low frequency to ensure enough time for the ALU outputs to settle under worst-case conditions (i.e., conditions resulting in the maximum possible propagation delay).
For example, a CPU starts an addition operation by routing the operands from their sources (typically processor registers) to the ALU's operand inputs, while simultaneously applying a value to the ALU's opcode input that configures it to perform an addition operation. At the same time, the CPU enables the destination register to store the ALU output (the resulting sum from the addition operation) upon operation completion. The ALU's input signals, which are held stable until the next clock, are allowed to propagate through the ALU and to the destination register while the CPU waits for the next clock. When the next clock arrives, the destination register stores the ALU result and, since the ALU operation has completed, the ALU inputs may be set up for the next ALU operation.
Functions
A number of basic arithmetic and bitwise logic functions are commonly supported by ALUs. Basic, general purpose ALUs typically include these operations in their repertoires:
Arithmetic operations
* ''Add'': A and B are summed and the sum appears at Y and carry-out.
* ''Add with carry'': A, B and carry-in are summed and the sum appears at Y and carry-out.
* ''Subtract'': B is subtracted from A (or vice versa) and the difference appears at Y and carry-out. For this function, carry-out is effectively a "borrow" indicator. This operation may also be used to compare the magnitudes of A and B; in such cases the Y output may be ignored by the processor, which is only interested in the status bits (particularly zero and negative) that result from the operation.
* ''Subtract with borrow'': B is subtracted from A (or vice versa) with borrow (carry-in) and the difference appears at Y and carry-out (borrow out).
* ''Two's complement'': A (or B) is subtracted from zero and the difference appears at Y.
* ''Increment'': A (or B) is increased by one and the resulting value appears at Y.
* ''Decrement'': A (or B) is decreased by one and the resulting value appears at Y.
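The arithmetic operations above can be modelled in a few lines, assuming an 8-bit word. This is only a behavioural sketch: the function names are invented here, and the subtract functions report a "borrow" as 1 when a borrow occurred, which is one of several conventions (some processors instead output the inverted carry).

```python
WIDTH = 8                      # assumed ALU word size
MASK = (1 << WIDTH) - 1


def add(a: int, b: int, carry_in: int = 0) -> tuple:
    """Add, or add with carry when carry_in is 1. Returns (Y, carry_out)."""
    total = (a & MASK) + (b & MASK) + (carry_in & 1)
    return total & MASK, total >> WIDTH


def subtract(a: int, b: int, borrow_in: int = 0) -> tuple:
    """Subtract B from A, optionally with borrow. Returns (Y, borrow_out)."""
    diff = (a & MASK) - (b & MASK) - (borrow_in & 1)
    return diff & MASK, int(diff < 0)


def negate(a: int) -> int:
    """Two's complement: the operand is subtracted from zero."""
    return subtract(0, a)[0]


def increment(a: int) -> int:
    return add(a, 1)[0]


def decrement(a: int) -> int:
    return subtract(a, 1)[0]
```

For instance, `add(200, 100)` returns `(44, 1)` because the true sum 300 does not fit in eight bits, and `subtract(5, 10)` returns `(251, 1)`, the two's complement encoding of -5 with a borrow.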
Bitwise logical operations
* ''AND'': the bitwise AND of A and B appears at Y.
* ''OR'': the bitwise OR of A and B appears at Y.
* ''Exclusive-OR'': the bitwise XOR of A and B appears at Y.
* ''Ones' complement'': all bits of A (or B) are inverted and appear at Y.
Bit shift operations
ALU shift operations cause operand A (or B) to shift left or right (depending on the opcode) and the shifted operand appears at Y. Simple ALUs typically can shift the operand by only one bit position, whereas more complex ALUs employ barrel shifters that allow them to shift the operand by an arbitrary number of bits in one operation. In all single-bit shift operations, the bit shifted out of the operand appears on carry-out; the value of the bit shifted into the operand depends on the type of shift.
* ''Arithmetic shift'': the operand is treated as a two's complement integer, meaning that the most significant bit is a "sign" bit and is preserved.
* ''Logical shift'': a logic zero is shifted into the operand. This is used to shift unsigned integers.
* ''Rotate'': the operand is treated as a circular buffer of bits in which its least and most significant bits are effectively adjacent.
* ''Rotate through carry'': the carry bit and operand are collectively treated as a circular buffer of bits.
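The single-bit shift types above differ only in which bit is shifted in. The following sketch, assuming an 8-bit word and invented function names, returns the shifted value together with the bit that would appear on carry-out. Only the right-going arithmetic shift is shown, since that is the direction in which the sign bit has to be replicated.

```python
WIDTH = 8                      # assumed ALU word size
MASK = (1 << WIDTH) - 1
MSB = 1 << (WIDTH - 1)


def logical_shift_right(a: int) -> tuple:
    """A zero enters at the top; the old least significant bit goes to carry-out."""
    a &= MASK
    return a >> 1, a & 1


def arithmetic_shift_right(a: int) -> tuple:
    """The sign bit is preserved (copied back into the top position)."""
    a &= MASK
    return (a >> 1) | (a & MSB), a & 1


def shift_left(a: int) -> tuple:
    """A zero enters at the bottom; the old most significant bit goes to carry-out."""
    a &= MASK
    return (a << 1) & MASK, a >> (WIDTH - 1)


def rotate_left(a: int) -> tuple:
    """The bit leaving the top re-enters at the bottom."""
    a &= MASK
    msb = a >> (WIDTH - 1)
    return ((a << 1) & MASK) | msb, msb


def rotate_left_through_carry(a: int, carry_in: int) -> tuple:
    """The stored carry enters at the bottom; the old top bit becomes the new carry."""
    a &= MASK
    return ((a << 1) & MASK) | (carry_in & 1), a >> (WIDTH - 1)
```

Applied to the operand `0b10000001`, the logical right shift gives `0b01000000`, the arithmetic right shift gives `0b11000000`, and the left rotate gives `0b00000011`; each of these sets carry-out to 1.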
Other operations
* ''Pass through'': all bits of A (or B) appear unmodified at Y. This operation is typically used to determine the parity of the operand or whether it is zero or negative, or to copy the operand to a processor register.
Applications
Status usage

Upon completion of each ALU operation, the ALU's status output signals are usually stored in external registers to make them available for future ALU operations (e.g., to implement multiple-precision arithmetic) and for controlling conditional branching. The bit registers that store the status output signals are often collectively treated as a single, multi-bit register, which is referred to as the "status register" or "condition code register".
Depending on the ALU operation being performed, some status register bits may be changed and others may be left unmodified. For example, in bitwise logical operations such as AND and OR, the carry status bit is typically not modified as it is not relevant to such operations.
In CPUs, the stored carry-out signal is usually connected to the ALU's carry-in net. This facilitates efficient propagation of carries (which may represent addition carries, subtraction borrows, or shift overflows) when performing multiple-precision operations, as it eliminates the need for software-management of carry propagation (via conditional branching, based on the carry status bit).
Operand and result data paths

The sources of ALU operands and destinations of ALU results depend on the architecture of the encapsulating processor and the operation being performed. Processor architectures vary widely, but in general-purpose CPUs, the ALU typically operates in conjunction with a register file (array of processor registers) or accumulator register, which the ALU frequently uses as both a source of operands and a destination for results. To accommodate other operand sources, multiplexers are commonly used to select either the register file or alternative ALU operand sources as required by each machine instruction.
For example, the architecture shown to the right employs a register file with two read ports, which allows the values stored in any two registers (or the same register) to be ALU operands. Alternatively, it allows either ALU operand to be sourced from an ''immediate operand'' (a constant value which is directly encoded in the machine instruction) or from memory. The ALU result may be written to any register in the register file or to memory.
Multiple-precision arithmetic
In integer arithmetic computations, multiple-precision arithmetic is an algorithm that operates on integers which are larger than the ALU word size. To do this, the algorithm treats each integer as an ordered collection of ALU-size fragments, arranged from most-significant (MS) to least-significant (LS) or vice versa. For example, in the case of an 8-bit ALU, the 24-bit integer 0x123456 would be treated as a collection of three 8-bit fragments: 0x12 (MS), 0x34, and 0x56 (LS). Since the size of a fragment exactly matches the ALU word size, the ALU can directly operate on each such "piece" of an operand.
The algorithm uses the ALU to directly operate on particular operand fragments and thus generate a corresponding fragment (a "partial") of the multi-precision result. Each partial, when generated, is written to an associated region of storage that has been designated for the multiple-precision result. This process is repeated for all operand fragments so as to generate a complete collection of partials, which is the result of the multiple-precision operation.
In arithmetic operations (e.g., addition, subtraction), the algorithm starts by invoking an ALU operation on the operands' LS fragments, thereby producing both an LS partial and a carry-out bit. The algorithm writes the partial to designated storage, whereas the processor's state machine typically stores the carry-out bit to an ALU status register. The algorithm then advances to the next fragment of each operand's collection and invokes an ALU operation on these fragments along with the stored carry bit from the previous ALU operation, thus producing another (more significant) partial and a carry-out bit. As before, the carry bit is stored to the status register and the partial is written to designated storage. This process repeats until all operand fragments have been processed, resulting in a complete collection of partials in storage, which together make up the multi-precision arithmetic result.
In multiple-precision shift operations, the order of operand fragment processing depends on the shift direction. In left-shift operations, fragments are processed LS first because the LS bit of each partial—which is conveyed via the stored carry bit—must be obtained from the MS bit of the previously left-shifted, less-significant operand. Conversely, operands are processed MS first in right-shift operations because the MS bit of each partial must be obtained from the LS bit of the previously right-shifted, more-significant operand.
In bitwise logical operations (e.g., logical AND, logical OR), the operand fragments may be processed in any arbitrary order because each partial depends only on the corresponding operand fragments (the stored carry bit from the previous ALU operation is ignored).
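The fragment-by-fragment procedure for addition and for left shifts can be sketched as below, assuming an 8-bit ALU and operands stored as lists of fragments with the least significant fragment first. The helper names and the list representation are assumptions for illustration; the `carry` variable stands in for the stored carry bit of the status register.

```python
WIDTH = 8                      # assumed ALU word size
MASK = (1 << WIDTH) - 1


def alu_add(a: int, b: int, carry_in: int) -> tuple:
    """One ALU add-with-carry operation on word-sized fragments."""
    total = a + b + carry_in
    return total & MASK, total >> WIDTH


def alu_rotate_left_through_carry(a: int, carry_in: int) -> tuple:
    """One ALU single-bit left rotate through carry."""
    return ((a << 1) & MASK) | carry_in, a >> (WIDTH - 1)


def multi_add(a_frags: list, b_frags: list) -> tuple:
    """Add two equal-length operands, LS fragment first. Returns (partials, carry)."""
    carry = 0
    partials = []
    for a, b in zip(a_frags, b_frags):
        partial, carry = alu_add(a, b, carry)   # carry links successive fragments
        partials.append(partial)
    return partials, carry


def multi_shift_left(frags: list) -> tuple:
    """Shift a multi-fragment operand left by one bit, processing LS fragment first."""
    carry = 0
    partials = []
    for a in frags:
        partial, carry = alu_rotate_left_through_carry(a, carry)
        partials.append(partial)
    return partials, carry
```

Using the 24-bit value from the text, `multi_add([0x56, 0x34, 0x12], [0xEF, 0xCD, 0x00])` returns `([0x45, 0x02, 0x13], 0)`, that is, 0x123456 + 0x00CDEF = 0x130245, and `multi_shift_left([0x56, 0x34, 0x12])` returns `([0xAC, 0x68, 0x24], 0)`, that is, 0x2468AC.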
Binary fixed-point addition and subtraction
Binary fixed-point values are represented by integers. Consequently, for any particular fixed-point scale factor (or implied radix point position), an ALU can directly add or subtract two fixed-point operands and produce a fixed-point result. This capability is commonly used in both fixed-point and floating-point addition and subtraction.
In floating-point addition and subtraction, the significand of the smaller operand is right-shifted so that its fixed-point scale factor matches that of the larger operand. The ALU then adds or subtracts the aligned significands to produce a result significand. Together with other operand elements, the result significand is normalized and rounded to produce the floating-point result.
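The alignment step can be illustrated with a deliberately simplified model in which a positive value is a pair `(significand, exponent)` meaning `significand * 2**exponent`. This representation and the function name are assumptions for illustration; the sketch discards the bits shifted out during alignment and omits the normalization and rounding that a real floating-point unit performs.

```python
def fp_add(sig_a: int, exp_a: int, sig_b: int, exp_b: int) -> tuple:
    """Add two positive values of the form sig * 2**exp. Returns (sig, exp)."""
    # Make operand A the one with the larger exponent.
    if exp_a < exp_b:
        sig_a, exp_a, sig_b, exp_b = sig_b, exp_b, sig_a, exp_a
    # Right-shift the other significand so both share the same scale factor.
    sig_b >>= exp_a - exp_b
    # With matching scale factors this is an ordinary integer addition.
    return sig_a + sig_b, exp_a
```

For example, `fp_add(0b1100, 2, 0b1000, 0)` aligns the second significand to `0b0010` and returns `(0b1110, 2)`, which represents 48 + 8 = 56.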
Complex operations
Although it is possible to design ALUs that can perform complex functions, this is usually impractical due to the resulting increases in circuit complexity, power consumption, propagation delay, cost and size. Consequently, ALUs are typically limited to simple functions that can be executed at very high speeds (i.e., very short propagation delays), with more complex functions being the responsibility of software or external circuitry. For example:
* In simple cases in which a CPU contains a single ALU, the CPU typically implements a complex operation by orchestrating a sequence of ALU operations according to a software algorithm.
* More specialized architectures may use multiple ALUs to accelerate complex operations. In such systems, the ALUs are often pipelined, with intermediate results passing through ALUs arranged like a factory production line. Performance is greatly improved over that of a single ALU because all of the ALUs operate concurrently and software overhead is significantly reduced.
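As an example of the first case, multiplication can be built from a sequence of simple add and single-bit shift steps (the classic shift-and-add method). The sketch below assumes 8-bit unsigned operands and uses ordinary Python integers for the double-width product; on a real 8-bit ALU each wide addition would itself be a multiple-precision operation. The function name is an assumption.

```python
WIDTH = 8                      # assumed operand width
MASK = (1 << WIDTH) - 1


def multiply(a: int, b: int) -> int:
    """Multiply two unsigned 8-bit values using only add and shift steps."""
    product = 0
    multiplicand = a & MASK
    multiplier = b & MASK
    for _ in range(WIDTH):
        if multiplier & 1:             # test the current multiplier bit
            product += multiplicand    # add step
        multiplicand <<= 1             # shift multiplicand left by one bit
        multiplier >>= 1               # shift multiplier right by one bit
    return product
```

The loop always runs eight iterations, one per multiplier bit, which reflects why such operations cost many ALU cycles when no dedicated multiplier is available.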
Graphics processing units
Graphics processing units (GPUs) often contain hundreds or thousands of ALUs which can operate concurrently. Depending on the application and GPU architecture, the ALUs may be used to simultaneously process unrelated data or to operate in parallel on related data. An example of the latter is graphics rendering, in which multiple ALUs perform the same operation in parallel on a group of pixels, with each ALU operating on a pixel within a scene.
Implementation
An ALU is usually implemented either as a stand-alone integrated circuit (IC), such as the 74181, or as part of a more complex IC. In the latter case, an ALU is typically instantiated by synthesizing it from a description written in VHDL, Verilog or some other hardware description language. For example, the following VHDL code describes a very simple 8-bit ALU:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all; -- provides the signed and unsigned types

entity alu is
port ( -- the alu connections to external circuitry:
  A  : in signed(7 downto 0);   -- operand A
  B  : in signed(7 downto 0);   -- operand B
  OP : in unsigned(2 downto 0); -- opcode
  Y  : out signed(7 downto 0)); -- operation result
end alu;

architecture behavioral of alu is
begin
  process (A, B, OP) -- combinational: re-evaluated whenever an input changes
  begin
    case OP is -- decode the opcode and perform the operation:
      when "000" => Y <= A + B;   -- add
      when "001" => Y <= A - B;   -- subtract
      when "010" => Y <= A - 1;   -- decrement
      when "011" => Y <= A + 1;   -- increment
      when "100" => Y <= not A;   -- 1's complement
      when "101" => Y <= A and B; -- bitwise AND
      when "110" => Y <= A or B;  -- bitwise OR
      when "111" => Y <= A xor B; -- bitwise XOR
      when others => Y <= (others => 'X'); -- opcode contains a non-binary value
    end case;
  end process;
end behavioral;
History
Mathematician John von Neumann proposed the ALU concept in 1945 in a report on the foundations for a new computer called the EDVAC.
The cost, size, and power consumption of electronic circuitry were relatively high throughout the infancy of the Information Age. Consequently, all early computers had a serial ALU that operated on one data bit at a time, although they often presented a wider word size to programmers. The first computer to have multiple parallel discrete single-bit ALU circuits was the 1951 Whirlwind I, which employed sixteen such "math units" to enable it to operate on 16-bit words.
In 1967, Fairchild introduced the first ALU-like device implemented as an integrated circuit, the Fairchild 3800, consisting of an eight-bit arithmetic unit with accumulator. It supported only additions and subtractions, with no logic functions.
Full integrated-circuit ALUs soon emerged, including four-bit ALUs such as the Am2901 and 74181. These devices were typically "bit slice" capable, meaning they had "carry look ahead" signals that facilitated the use of multiple interconnected ALU chips to create an ALU with a wider word size. These devices quickly became popular and were widely used in bit-slice minicomputers.
Microprocessors began to appear in the early 1970s. Even though transistors had become smaller, there was sometimes insufficient die space for a full-word-width ALU and, as a result, some early microprocessors employed a narrow ALU that required multiple cycles per machine language instruction. Examples include the popular Zilog Z80, which performed eight-bit additions with a four-bit ALU. Over time, transistor geometries shrank further, following Moore's law, and it became feasible to build wider ALUs on microprocessors.
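The idea of performing an eight-bit addition with a four-bit ALU can be sketched as two passes through a narrow adder, with the carry from the low half fed into the high half. This is a behavioural illustration only, with invented names; it is not a model of the actual circuitry of the Z80 or of any other processor.

```python
NIBBLE_BITS = 4
NIBBLE_MASK = (1 << NIBBLE_BITS) - 1


def alu4_add(a: int, b: int, carry_in: int) -> tuple:
    """One pass through a hypothetical 4-bit adder. Returns (Y, carry_out)."""
    total = (a & NIBBLE_MASK) + (b & NIBBLE_MASK) + (carry_in & 1)
    return total & NIBBLE_MASK, total >> NIBBLE_BITS


def add8_in_two_passes(a: int, b: int) -> tuple:
    """Add two 8-bit values by using the 4-bit adder twice, low half first."""
    low, carry = alu4_add(a, b, 0)                                    # first pass
    high, carry = alu4_add(a >> NIBBLE_BITS, b >> NIBBLE_BITS, carry)  # second pass
    return (high << NIBBLE_BITS) | low, carry
```

For example, `add8_in_two_passes(0x3C, 0x2F)` returns `(0x6B, 0)`: the low halves produce 0xB with a carry, and the high halves plus that carry produce 0x6.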
Modern integrated circuit (IC) transistors are orders of magnitude smaller than those of the early microprocessors, making it possible to fit highly complex ALUs on ICs. Many modern ALUs have wide word widths and architectural enhancements, such as barrel shifters and binary multipliers, that allow them to perform, in a single clock cycle, operations that would have required multiple operations on earlier ALUs.
ALUs can be realized as mechanical, electro-mechanical or electronic circuits and, in recent years, research into biological ALUs has been carried out (e.g., actin-based).
See also
* Adder (electronics)
* Address generation unit (AGU)
* Binary multiplier
* Execution unit
* Load–store unit
* Status register