x87 is a

floating-point In computing, floating-point arithmetic (FP) is arithmetic that represents real numbers approximately, using an integer with a fixed precision, called the significand, scaled by an integer exponent of a fixed base. For example, 12.345 can ...

-related subset of the

x86 architecture x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel based on the Intel 8086 microprocessor and its 8088 variant. The 8086 was int ...

instruction set In computer science, an instruction set architecture (ISA), also called computer architecture, is an abstract model of a computer. A device that executes instructions described by that ISA, such as a central processing unit (CPU), is called an ...

. It originated as an extension of the 8086 instruction set in the form of optional floating-point

coprocessors A coprocessor is a computer processor used to supplement the functions of the primary processor (the CPU). Operations performed by the coprocessor may be floating-point arithmetic, graphics, signal processing, string processing, cryptography or ...

that worked in tandem with corresponding x86 CPUs. These microchips had names ending in "87". This was also known as the NPX (''Numeric Processor eXtension''). Like other extensions to the basic instruction set, x87 instructions are not strictly needed to construct working programs, but provide hardware and

microcode In processor design, microcode (μcode) is a technique that interposes a layer of computer organization between the central processing unit (CPU) hardware and the programmer-visible instruction set architecture of a computer. Microcode is a la ...

implementations of common numerical tasks, allowing these tasks to be performed much faster than corresponding

machine code In computer programming, machine code is any low-level programming language, consisting of machine language instructions, which are used to control a computer's central processing unit (CPU). Each instruction causes the CPU to perform a ve ...

routines can. The x87 instruction set includes instructions for basic floating-point operations such as addition, subtraction and comparison, but also for more complex numerical operations, such as the computation of the

tangent In geometry, the tangent line (or simply tangent) to a plane curve at a given point is the straight line that "just touches" the curve at that point. Leibniz defined it as the line through a pair of infinitely close points on the curve. Mo ...

function and its inverse, for example. Most x86 processors since the

Intel 80486 The Intel 486, officially named i486 and also known as 80486, is a microprocessor. It is a higher-performance follow-up to the Intel 386. The i486 was introduced in 1989. It represents the fourth generation of binary compatible CPUs following t ...

have had these x87 instructions implemented in the main CPU, but the term is sometimes still used to refer to that part of the instruction set. Before x87 instructions were standard in PCs,

compiler In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs tha ...

s or programmers had to use rather slow library calls to perform floating-point operations, a method that is still common in (low-cost)

embedded system An embedded system is a computer system—a combination of a computer processor, computer memory, and input/output peripheral devices—that has a dedicated function within a larger mechanical or electronic system. It is ''embedded ...

Description

The x87 registers form an eight-level deep non-strict

stack Stack may refer to: Places * Stack Island, an island game reserve in Bass Strait, south-eastern Australia, in Tasmania’s Hunter Island Group * Blue Stack Mountains, in Co. Donegal, Ireland People * Stack (surname) (including a list of people ...

structure ranging from ST(0) to ST(7) with registers that can be directly accessed by either operand, using an offset relative to the top, as well as pushed and popped. (This scheme may be compared to how a

stack frame In computer science, a call stack is a stack data structure that stores information about the active subroutines of a computer program. This kind of stack is also known as an execution stack, program stack, control stack, run-time stack, or mach ...

may be both pushed/popped and indexed.) There are instructions to push, calculate, and pop values on top of this stack;

unary operation In mathematics, an unary operation is an operation with only one operand, i.e. a single input. This is in contrast to binary operations, which use two operands. An example is any function , where is a set. The function is a unary operation o ...

s (FSQRT, FPTAN etc.) then implicitly address the topmost ST(0), while

binary operations In mathematics, a binary operation or dyadic operation is a rule for combining two elements (called operands) to produce another element. More formally, a binary operation is an operation of arity two. More specifically, an internal binary op ...

(FADD, FMUL, FCOM, etc.) implicitly address ST(0) and ST(1). The non-strict stack model also allows binary operations to use ST(0) together with a direct '' memory operand'' or with an ''explicitly'' specified stack register, ST(''x''), in a role similar to a traditional accumulator (a combined destination and left operand). This can also be reversed on an instruction-by-instruction basis with ST(0) as the unmodified operand and ST(''x'') as the ''destination''. Furthermore, the contents in ST(0) can be exchanged with another stack register using an instruction called FXCH ST(''x''). These properties make the x87 stack usable as seven freely addressable registers plus a dedicated accumulator (or as seven independent accumulators). This is especially applicable on

superscalar A superscalar processor is a CPU that implements a form of parallelism called instruction-level parallelism within a single processor. In contrast to a scalar processor, which can execute at most one single instruction per clock cycle, a sup ...

x86 processors (such as the

Pentium Pentium is a brand used for a series of x86 architecture-compatible microprocessors produced by Intel. The original Pentium processor from which the brand took its name was first released on March 22, 1993. After that, the Pentium II and P ...

of 1993 and later), where these exchange instructions (codes D9C8..D9CF_h) are optimized down to a zero clock penalty by using one of the integer paths for FXCH ST(''x'') in parallel with the FPU instruction. Despite being natural and convenient for human

assembly language In computer programming, assembly language (or assembler language, or symbolic machine code), often referred to simply as Assembly and commonly abbreviated as ASM or asm, is any low-level programming language with a very strong correspondence b ...

programmers, some compiler writers have found it complicated to construct automatic code generators that schedule x87 code effectively. Such a stack-based interface potentially can minimize the need to save scratch variables in function calls compared with a register-based interface (although, historically, design issues in the 8087 implementation limited that potential.) The x87 provides single-precision, double-precision and 80-bit double-extended precision binary floating-point arithmetic as per the

IEEE 754-1985 IEEE 754-1985 was an industry standard for representing floating-point numbers in computers, officially adopted in 1985 and superseded in 2008 by IEEE 754-2008, and then again in 2019 by minor revision IEEE 754-2019. During its 23 years, it was th ...

standard. By default, the x87 processors all use 80-bit double-extended precision internally (to allow sustained precision over many calculations, see IEEE 754 design rationale). A given sequence of arithmetic operations may thus behave slightly differently compared to a strict single-precision or double-precision IEEE 754 FPU. As this may sometimes be problematic for some semi-numerical calculations written to assume double precision for correct operation, to avoid such problems, the x87 can be configured using a special configuration/status register to automatically round to single or double precision after each operation. Since the introduction of

SSE2 SSE2 (Streaming SIMD Extensions 2) is one of the Intel SIMD (Single Instruction, Multiple Data) processor supplementary instruction sets first introduced by Intel with the initial version of the Pentium 4 in 2000. It extends the earlier SSE i ...

, the x87 instructions are not as essential as they once were, but remain important as a high-precision scalar unit for numerical calculations sensitive to

round-off error A roundoff error, also called rounding error, is the difference between the result produced by a given algorithm using exact arithmetic and the result produced by the same algorithm using finite-precision, rounded arithmetic. Rounding errors are d ...

and requiring the

64-bit In computer architecture, 64-bit integers, memory addresses, or other data units are those that are 64 bits wide. Also, 64-bit CPUs and ALUs are those that are based on processor registers, address buses, or data buses of that size. A ...

mantissa precision and extended range available in the 80-bit format.

Performance

Clock cycle counts for examples of typical x87 FPU instructions (only register-register versions shown here). The ''A''...''B'' notation (minimum to maximum) covers timing variations dependent on transient pipeline status and the arithmetic precision chosen (32, 64 or 80 bits); it also includes variations due to numerical cases (such as the number of set bits, zero, etc.). The L → H notation depicts values corresponding to the lowest (L) and the highest (H) maximal clock frequencies that were available. :* An effective zero clock delay is often possible, via superscalar execution. :^§ The 5 MHz 8087 was the original x87 processor. Compared to typical software-implemented floating-point routines on an 8086 (without an 8087), the factors would be even larger, perhaps by another factor of 10 (i.e., a correct floating-point addition in assembly language may well consume over 1000 cycles).

Manufacturers

Companies that have designed or manufactured floating-point units compatible with the Intel 8087 or later models include AMD (''287'', ''387'', ''486DX'', ''5x86'', ''K5'', ''K6'', ''K7'', ''K8''),

Chips and Technologies Chips and Technologies (C&T), founded in Milpitas, California in December 1984 by Gordon A. Campbell and Dado Banatao, was an early fabless semiconductor company. Its first product, announced September 1985, was a four chip EGA chipset that ...

(the ''Super MATH'' coprocessors),

Cyrix Cyrix Corporation was a microprocessor developer that was founded in 1988 in Richardson, Texas, as a specialist supplier of floating point units for 286 and 386 microprocessors. The company was founded by Tom Brightman and Jerry Rogers. In ...

(the ''FasMath'', ''Cx87SLC'', ''Cx87DLC'', etc., ''6x86'', ''Cyrix MII''),

Fujitsu is a Japanese multinational information and communications technology equipment and services corporation, established in 1935 and headquartered in Tokyo. Fujitsu is the world's sixth-largest IT services provider by annual revenue, and the la ...

(early ''Pentium Mobile'' etc.),

Harris Semiconductor Harris Corporation was an American technology company, defense contractor, and information technology services provider that produced wireless equipment, tactical radios, electronic systems, night vision equipment and both terrestrial and space ...

(manufactured ''80387'' and ''486DX'' processors), IBM (various ''387'' and ''486'' designs), IDT (the

WinChip The WinChip series was a low-power Socket 7-based x86 processor designed by Centaur Technology and marketed by its parent company IDT. Overview Design The design of the WinChip was quite different from other processors of the time. Instead of ...

, ''C3'', ''C7'', ''Nano'', etc.), IIT (the ''2C87'', ''3C87'', etc.), LC Technology (the ''Green MATH'' coprocessors),

National Semiconductor National Semiconductor was an American semiconductor manufacturer which specialized in analog devices and subsystems, formerly with headquarters in Santa Clara, California. The company produced power management integrated circuits, display dr ...

(the ''Geode GX1'', ''Geode GXm'', etc.),

NexGen NexGen (Milpitas, California) was a private semiconductor company that designed x86 microprocessors until it was purchased by AMD in 1996. NexGen was a fabless design house that designed its chips but relied on other companies for production ...

(the ''Nx587''),

Rise Technology Rise Technology was a short lived microprocessor manufacturer that produced the Intel x86 MMX compatible mP6 processor. The mP6 was a microprocessor that was designed to perform a smaller number of types of computer instructions so that it can ...

(the ''mP6''),

ST Microelectronics STMicroelectronics N.V. commonly referred as ST or STMicro is a Dutch multinational corporation and technology company of French-Italian origin headquartered in Plan-les-Ouates near Geneva, Switzerland and listed on the French stock market. ST ...

(manufactured ''486DX'', ''5x86'', etc.),

Texas Instruments Texas Instruments Incorporated (TI) is an American technology company headquartered in Dallas, Texas, that designs and manufactures semiconductors and various integrated circuits, which it sells to electronics designers and manufacturers globa ...

(manufactured ''486DX'' processors etc.),

Transmeta Transmeta Corporation was an American fabless semiconductor company based in Santa Clara, California. It developed low power x86 compatible microprocessors based on a VLIW core and a software layer called Code Morphing Software. Code Morphing S ...

(the ''TM5600'' and ''TM5800''), ULSI (the ''Math·Co'' coprocessors),

VIA Via or VIA may refer to the following: Science and technology * MOS Technology 6522, Versatile Interface Adapter * ''Via'' (moth), a genus of moths in the family Noctuidae * Via (electronics), a through-connection * VIA Technologies, a Taiwa ...

(the ''C3'', ''C7'', and ''Nano'', etc.), and Xtend (the ''83S87SX-25'' and other coprocessors).

Architectural generations

8087

The ''8087'' was the first math

coprocessor A coprocessor is a computer processor used to supplement the functions of the primary processor (the CPU). Operations performed by the coprocessor may be floating-point arithmetic, graphics, signal processing, string processing, cryptography or I ...

for 16-bit processors designed by

Intel Intel Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California. It is the world's largest semiconductor chip manufacturer by revenue, and is one of the developers of the x86 ser ...

. It was built to be paired with the

Intel 8088 The Intel 8088 ("''eighty-eighty-eight''", also called iAPX 88) microprocessor is a variant of the Intel 8086. Introduced on June 1, 1979, the 8088 has an eight-bit external data bus instead of the 16-bit bus of the 8086. The 16-bit registers and ...

8086 The 8086 (also called iAPX 86) is a 16-bit microprocessor chip designed by Intel between early 1976 and June 8, 1978, when it was released. The Intel 8088, released July 1, 1979, is a slightly modified chip with an external 8-bit data bus (allo ...

microprocessors. (Intel's earlier 8231 and 8232 floating-point processors, marketed for use with the i8080 CPU, were in fact licensed versions of AMD's Am9511 and Am9512 FPUs from 1977 and 1979.)

80187

The ''80187'' (''80C187'') is the math coprocessor for the

Intel 80186 The Intel 80186, also known as the iAPX 186, or just 186, is a microprocessor and microcontroller introduced in 1982. It was based on the Intel 8086 and, like it, had a 16-bit external data bus multiplexed with a 20-bit address bus. The ...

CPU. It is incapable of operating with the 80188, as the 80188 has an eight-bit data bus; the 80188 can only use the 8087. The 80187 did not appear at the same time as the 80186 and 80188, but was in fact launched after the 80287 and the 80387. Although the interface to the main processor is the same as that of the 8087, its core is that of the 80387 and is thus fully

IEEE 754 The IEEE Standard for Floating-Point Arithmetic (IEEE 754) is a technical standard for floating-point arithmetic established in 1985 by the Institute of Electrical and Electronics Engineers (IEEE). The standard addressed many problems found ...

-compliant and capable of executing all the 80387's extra instructions.

80287

The ''80287'' (''i287'') is the math

for the

Intel 80286 The Intel 80286 (also marketed as the iAPX 286 and often called Intel 286) is a 16-bit microprocessor that was introduced on February 1, 1982. It was the first 8086-based CPU with separate, non- multiplexed address and data buses and also the ...

series of

microprocessor A microprocessor is a computer processor where the data processing logic and control is included on a single integrated circuit, or a small number of integrated circuits. The microprocessor contains the arithmetic, logic, and control circ ...

s. Intel's models included variants with specified upper frequency limits ranging from 6 up to 12 MHz. Later followed the i80287XL with 387 microarchitecture and the i80287XLT, a special version intended for laptops, as well as other variants. The available 10 MHz Intel 80287-10 Numerics Coprocessor version was for 250

USD The United States dollar (symbol: $; code: USD; also abbreviated US$ or U.S. Dollar, to distinguish it from other dollar-denominated currencies; referred to as the dollar, U.S. dollar, American dollar, or colloquially buck) is the official ...

in quantities of 100. The 80287XL is actually an 80387SX with a 287 pinout. It contains an internal 3/2 multiplier, so that motherboards that ran the coprocessor at 2/3 CPU speed could instead run the FPU at the same speed of the CPU. Other 287 models with 387-like performance are the Intel 80C287, built using

CHMOS CHMOS refers to one of a series of Intel CMOS processes developed from their HMOS process. (H stands for high-density). It was first developed in 1981. CHMOS was used in the Intel 80C51BH, a new version of their standard MCS-51 microcontroller ...

III, and the AMD 80EC287 manufactured in AMD's

CMOS Complementary metal–oxide–semiconductor (CMOS, pronounced "sea-moss", ) is a type of metal–oxide–semiconductor field-effect transistor (MOSFET) fabrication process that uses complementary and symmetrical pairs of p-type and n-type MOSF ...

process, using only fully static gates. The 80287 and 80287XL work with the

80386 The Intel 386, originally released as 80386 and later renamed i386, is a 32-bit microprocessor introduced in 1985. The first versions had 275,000 transistors Cyrix Cx486SLC The Cyrix Cx486SLC is a x86 microprocessor that was developed by Cyrix. It was one of Cyrix's first CPU offerings, released after years of selling math coprocessors that competed with Intel's units and offered better performance at a comparable ...

. However, for both of these chips the 80387 is strongly preferred for its higher performance and the greater capability of its instruction set. KL Intel C80287.jpg, 6 MHz version of the Intel 80287 Intel 80287 die.jpg, Intel 80287 die shot KL Intel i80287XL Big Markings.jpg, Intel 80287XL KL Intel 80287XLT.jpg, Intel 80287XLT

80387

The 80387 (387 or i387) is the first Intel coprocessor to be fully compliant with the

standard. Released in 1987, two years after the 386 chip, the i387 includes much improved speed over Intel's previous 8087/80287 coprocessors and improved characteristics of its trigonometric functions. It was made available for USD $500 in quantities of 100. Shortly afterwards, it was made available through Intel's Personal Computer Enhancement Operation for a retail market price of USD $795. The 8087 and 80287's FPTAN and FPATAN instructions are limited to an argument in the range ±π/4 (±45°), and the 8087 and 80287 have no ''direct'' instructions for the SIN and COS functions. Without a coprocessor, the 386 normally performs floating-point arithmetic through (relatively slow) software routines, implemented at runtime through a software exception handler. When a math coprocessor is paired with the 386, the coprocessor performs the floating-point arithmetic in hardware, returning results much faster than an (emulating) software library call. The i387 is compatible only with the standard i386 chip, which has a 32-bit processor bus. The later cost-reduced i386SX, which has a narrower 16-bit

data bus In computer architecture, a bus (shortened form of the Latin '' omnibus'', and historically also called data highway or databus) is a communication system that transfers data between components inside a computer, or between computers. This ...

, can not interface with the i387's 32-bit bus. The i386SX requires its own coprocessor, the

80387SX The Intel 80387SX (387SX or i387SX) is the math coprocessor, also called an FPU, for the Intel 80386SX microprocessor. Introduced in 1987, it was used to perform floating-point arithmetic operations directly in hardware. The coprocessor was desig ...

, which is compatible with the SX's narrower 16-bit data bus. File:KL Intel 80387.jpg, i387 File:KL Intel i387SX.jpg, i387SX File:KL intel i387DX.jpg, i387DX File:Intel 387 arch.svg, i387 microarchitecture with 16-bit

barrel shifter A barrel shifter is a digital circuit that can shift a data word by a specified number of bits without the use of any sequential logic, only pure combinational logic, i.e. it inherently provides a binary operation. It can however in theory also ...

and

CORDIC CORDIC (for "coordinate rotation digital computer"), also known as Volder's algorithm, or: Digit-by-digit method Circular CORDIC (Jack E. Volder), Linear CORDIC, Hyperbolic CORDIC (John Stephen Walther), and Generalized Hyperbolic CORDIC (GH C ...

unit File:80386with387.JPG, i386DX with i387DX File:Socket for Intel 80387.jpg, Socket for the 80387

80487

The i487SX (P23N) was marketed as a

floating-point unit In computing, floating-point arithmetic (FP) is arithmetic that represents real numbers approximately, using an integer with a fixed precision, called the significand, scaled by an integer exponent of a fixed base. For example, 12.345 can be ...

for Intel

i486SX Intel's i486SX was a modified Intel 486DX microprocessor with its floating-point unit (FPU) disabled. It was intended as a lower-cost CPU for use in low-end systems. Computer manufacturers that used these processors include Packard Bell, Compaq, ...

machines. It actually contained a full-blown i486DX implementation. When installed into an i486SX system, the i487 disabled the main CPU and took over all CPU operations. The i487 took measures to detect the presence of an i486SX and would not function without the original CPU in place.

80587

The Nx587 was the last FPU for x86 to be manufactured separately from the CPU, in this case NexGen's Nx586.

Notes

References

External links

Everything you always wanted to know about math coprocessors
{{Use dmy dates, date=April 2019 X86 architecture Stack machines Floating point Coprocessors