
A floating-point unit (FPU), numeric processing unit (NPU), or, colloquially, math coprocessor is a part of a
computer
system specially designed to carry out operations on
floating-point numbers. Typical operations are
addition
,
subtraction,
multiplication
,
division, and
square root. Modern designs generally include a
fused multiply-add instruction, since a multiplication followed by an addition was found to be very common in real-world code. Some FPUs can also perform various
transcendental functions such as
exponential or
trigonometric calculations, but the accuracy can be low, so some systems prefer to compute these functions in software.
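The difference between a fused multiply-add and a separate multiply followed by an add can be sketched in software. The example below is only an illustration, not how any particular FPU is built: the function names `fma_reference` and `unfused` are hypothetical, and the fused result is modelled with exact rational arithmetic followed by a single rounding rather than by calling a hardware instruction.

```python
from fractions import Fraction


def unfused(a, b, c):
    """Multiply, round to double, then add and round again (two roundings)."""
    return a * b + c


def fma_reference(a, b, c):
    """Model of a fused multiply-add: compute a*b + c exactly, round once.

    Fraction holds every finite double exactly, and converting the exact
    result back to float performs a single correctly rounded conversion.
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# Inputs chosen so that the exact product 1 - 2**-60 is not representable
# as a double and rounds to 1.0 when the multiplication is done on its own.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

print(unfused(a, b, c))        # the small term is lost to intermediate rounding
print(fma_reference(a, b, c))  # the small term survives the single rounding
```

With these inputs the separate multiply and add return zero, while the single-rounding model keeps the small residual term.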
Floating-point operations were originally handled in
software
in early computers. Over time, manufacturers began to provide standardized floating-point libraries as part of their software collections. Some machines, those dedicated to scientific processing, would include specialized hardware to perform some of these tasks with much greater speed. The introduction of
microcode
in the 1960s allowed these instructions to be included in the system's
instruction set architecture (ISA). Normally these would be decoded by the microcode into a series of instructions that were similar to the libraries, but on those machines with an FPU, they would instead be routed to that unit, which would perform them much faster. This allowed floating-point instructions to become universal while the floating-point hardware remained optional; for instance, on the
PDP-11 one could add the floating-point processor unit at any time using plug-in
expansion cards.
The introduction of the
microprocessor
in the 1970s led to a similar evolution as the earlier
mainframes and
minicomputers. Early
microcomputer systems performed floating point in software, typically in a vendor-specific library included in
ROM. Dedicated single-chip FPUs began to appear late in the decade, but they remained rare in real-world systems until the mid-1980s, and using them required software to be re-written to call them. As they became more common, the software libraries were modified to work like the microcode of earlier machines, performing the instructions on the main CPU if needed, but offloading them to the FPU if one was present. By the late 1980s,
semiconductor manufacturing had improved to the point where it became possible to include an FPU with the main CPU, resulting in designs like the
i486 and
68040. These designs were known as "integrated FPUs", and from the mid-1990s, FPUs were a standard feature of most CPU designs except those designed as low-cost
embedded processors.
In modern designs, a single CPU will typically include several
arithmetic logic units (ALUs) and several FPUs, reading many instructions at the same time and routing them to the various units for parallel execution. By the 2000s, even embedded processors generally included an FPU as well.
History
In 1954, the
IBM 704 had floating-point arithmetic as a standard feature, one of its major improvements over its predecessor the
IBM 701. This was carried forward to its successors the 709, 7090, and 7094.
In 1963, Digital announced the
PDP-6, which had floating point as a standard feature.
In 1963, the
GE-235 featured an "Auxiliary Arithmetic Unit" for floating point and double-precision calculations.
Historically, some systems implemented
floating point with a
coprocessor rather than as an integrated unit (today, coprocessors other than the CPU, such as
GPUs, which are not always built into the CPU, have FPUs as a rule, although the first generations of GPUs did not). This could be a single
integrated circuit
, an entire
circuit board or a cabinet. Where floating-point calculation hardware has not been provided, floating-point calculations are done in software, which takes more processor time, but avoids the cost of the extra hardware. For a particular computer architecture, the floating-point unit instructions may be
emulated by a library of software functions; this may permit the same
object code to run on systems with or without floating-point hardware. Emulation can be implemented on any of several levels: in the CPU as
microcode
, as an
operating system
function, or in
user-space code. When only integer functionality is available, the
CORDIC methods are most commonly used for
transcendental function evaluation.
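As a rough illustration of why CORDIC suits integer-only hardware, the sketch below evaluates sine and cosine using only integer additions, subtractions, shifts, and a small lookup table. It is a minimal rotation-mode example under stated assumptions, not the routine used by any specific FPU: the fixed-point format (30 fractional bits), the iteration count, and all names are hypothetical, and the `math` module is used only to precompute the table and gain constant that a hardware design would hold in ROM.

```python
import math

FRAC_BITS = 30            # assumed fixed-point format: 30 fractional bits
ITERATIONS = 30           # one table entry per iteration
ONE = 1 << FRAC_BITS      # fixed-point representation of 1.0

# Precomputed constants (held in ROM in a hardware implementation).
ATAN_TABLE = [round(math.atan(2.0 ** -i) * ONE) for i in range(ITERATIONS)]
_gain = 1.0
for _i in range(ITERATIONS):
    _gain *= math.sqrt(1.0 + 2.0 ** (-2 * _i))
INV_GAIN = round(ONE / _gain)   # compensates for the growth of the vector


def cordic_sin_cos(angle_fixed):
    """Return (sin, cos) in fixed point for an angle given in fixed-point radians.

    The angle is assumed to lie within [-pi/2, pi/2]; larger angles would
    need range reduction first. Only integer add, subtract, shift, and
    compare are used inside the loop.
    """
    x, y, z = INV_GAIN, 0, angle_fixed
    for i in range(ITERATIONS):
        if z >= 0:
            # Rotate counter-clockwise by atan(2**-i).
            x, y, z = x - (y >> i), y + (x >> i), z - ATAN_TABLE[i]
        else:
            # Rotate clockwise by atan(2**-i).
            x, y, z = x + (y >> i), y - (x >> i), z + ATAN_TABLE[i]
    return y, x


sin_fixed, cos_fixed = cordic_sin_cos(round(0.5 * ONE))
print(sin_fixed / ONE, cos_fixed / ONE)
```

Each iteration needs no multiplier, which is the property that keeps the hardware cost low.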
In most modern computer architectures, there is some division of floating-point operations from
integer
operations. This division varies significantly by architecture; some have dedicated floating-point registers, while some, like
Intel x86, go as far as independent
clocking schemes.
CORDIC routines have been implemented in
Intel x87 coprocessors (
8087,
80287,
80387
) up to the
80486 microprocessor series, as well as in the
Motorola 68881 and 68882 for some kinds of floating-point instructions, mainly as a way to reduce the
gate
counts (and complexity) of the FPU subsystem.
Floating-point operations are often
pipelined. In earlier
superscalar architectures without general
out-of-order execution, floating-point operations were sometimes pipelined separately from integer operations.
The modular architecture of the
Bulldozer microarchitecture uses a special FPU named FlexFPU, which uses
simultaneous multithreading. Each physical integer core, two per module, is single-threaded, in contrast with Intel's
Hyper-Threading, where two virtual simultaneous threads share the resources of a single physical core.
Floating-point library
Some floating-point hardware only supports the simplest operations: addition, subtraction, and multiplication. But even the most complex floating-point hardware has a finite number of operations it can support; for example, no FPUs directly support
arbitrary-precision arithmetic.
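Arbitrary-precision arithmetic is therefore provided by software libraries. As a small illustration, the sketch below uses Python's standard `decimal` module to compute one third to a caller-chosen number of significant digits, which is more than a hardware double can hold. The function name `one_third` is hypothetical.

```python
from decimal import Decimal, localcontext


def one_third(digits):
    """Compute 1/3 to `digits` significant decimal digits in software."""
    with localcontext() as ctx:
        ctx.prec = digits          # precision is a software setting, not a hardware limit
        return Decimal(1) / Decimal(3)


print(one_third(50))   # fifty significant digits
print(1 / 3)           # hardware double precision, for comparison
```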
When a CPU is executing a program that calls for a floating-point operation that is not directly supported by the hardware, the CPU uses a series of simpler floating-point operations. In systems without any floating-point hardware, the CPU
emulates it using a series of simpler
fixed-point arithmetic operations that run on the integer
arithmetic logic unit.
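A minimal sketch of such emulation is shown below: it multiplies two IEEE 754 single-precision values using only integer masks, shifts, additions, and one integer multiplication. It rests on simplifying assumptions that a real library would not make: both inputs and the result are assumed to be normal numbers (zeros, subnormals, infinities, and NaNs are not handled), and the names `f32_bits`, `bits_f32`, and `soft_mul_f32` are hypothetical. The `struct` module is used only to convert between Python floats and 32-bit patterns.

```python
import struct


def f32_bits(value):
    """Return the 32-bit pattern of `value` rounded to single precision."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def bits_f32(bits):
    """Return the float represented by a 32-bit single-precision pattern."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def soft_mul_f32(a_bits, b_bits):
    """Multiply two normal single-precision numbers using integer operations."""
    sign = (a_bits ^ b_bits) & 0x80000000
    exp_a = (a_bits >> 23) & 0xFF
    exp_b = (b_bits >> 23) & 0xFF
    # Restore the implicit leading 1 to obtain 24-bit significands.
    man_a = (a_bits & 0x7FFFFF) | 0x800000
    man_b = (b_bits & 0x7FFFFF) | 0x800000

    product = man_a * man_b            # 47 or 48 significant bits
    exp = exp_a + exp_b - 127          # remove one copy of the exponent bias

    # Normalize so that 24 significant bits are kept.
    if product & (1 << 47):
        shift = 24
        exp += 1
    else:
        shift = 23

    kept = product >> shift
    remainder = product & ((1 << shift) - 1)
    half = 1 << (shift - 1)
    # Round to nearest, ties to even.
    if remainder > half or (remainder == half and (kept & 1)):
        kept += 1
        if kept == (1 << 24):          # rounding carried into a new bit
            kept >>= 1
            exp += 1

    if not 1 <= exp <= 254:
        raise OverflowError("result is outside the normal range (not handled)")
    return sign | (exp << 23) | (kept & 0x7FFFFF)


result = soft_mul_f32(f32_bits(1.5), f32_bits(2.5))
print(bits_f32(result))
```

The product of two single-precision values is exact in double precision, so the result can be checked against the host's own arithmetic rounded back to single precision.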
The software that lists the necessary series of operations to emulate floating-point operations is often packaged in a floating-point
library
.
Integrated FPUs
In some cases, FPUs may be specialized, and divided between simpler floating-point operations (mainly addition and multiplication) and more complicated operations, like division. In some cases, only the simple operations may be implemented in hardware or
microcode
, while the more complex operations are implemented as software.
In some current architectures, the FPU functionality is combined with
SIMD units to perform SIMD computation; an example of this is the augmentation of the
x87 instruction set with the
SSE instruction set in the
x86-64 architecture used in newer Intel and AMD processors.
Add-on FPUs
Several models of the
PDP-11, such as the PDP-11/45, PDP-11/34a,
PDP-11/44,
and PDP-11/70,
could be fitted with an add-on floating-point unit to execute floating-point instructions. The PDP-11/60,
MicroPDP-11/23
and several
VAX models could execute floating-point instructions without an add-on FPU (the MicroPDP-11/23 required an add-on microcode option),
and offered add-on accelerators to further speed the execution of those instructions.
In the 1980s, it was common in
IBM PC
/compatible
microcomputers for the FPU to be entirely separate from the
CPU, and typically sold as an optional add-on. It would only be purchased if needed to speed up or enable math-intensive programs.
The IBM PC,
XT, and most compatibles based on the 8088 or 8086 had a socket for the optional 8087 coprocessor. The
AT and
80286-based systems were generally socketed for the
80287, and
80386/80386SX-based machines for the
80387 and
80387SX respectively, although early ones were socketed for the 80287, since the 80387 did not exist yet. Other companies manufactured coprocessors for the Intel x86 series. These included
Cyrix and
Weitek.
Acorn Computers opted for the WE32206 to offer
single,
double and
extended precision to its
ARM-powered
Archimedes
range, introducing a gate array to interface the ARM2 processor with the WE32206 to support the additional ARM floating-point instructions.
Acorn later offered the FPA10 coprocessor, developed by ARM, for various machines fitted with the ARM3 processor.
Coprocessors were available for the
Motorola 68000 family, the
68881 and 68882. These were common in
Motorola 68020/
68030-based
workstations, like the
Sun-3 series. They were also commonly added to higher-end models of Apple
Macintosh
and Commodore
Amiga series, but unlike IBM PC-compatible systems, sockets for adding the coprocessor were not as common in lower-end systems.
There are also add-on FPU coprocessor units for
microcontroller units (MCUs/μCs) and
single-board computers (SBCs), which serve to provide floating-point
arithmetic capability. These add-on FPUs are host-processor-independent, possess their own programming requirements (
operations,
instruction sets, etc.) and are often provided with their own
integrated development environments (IDEs).
See also
*
Arithmetic logic unit (ALU)
*
Address generation unit (AGU)
*
Load–store unit
*
CORDIC, routines used in many FPUs to implement functions without greatly increasing gate count
*
Execution unit
*
IEEE 754
floating-point standard
*
IBM hexadecimal floating point
*
Graphics processing unit
*
Multiply–accumulate operation