Octuple-precision floating-point format

In computing, octuple precision is a binary floating-point-based computer number format that occupies 32 bytes (256 bits) in computer memory. This 256-bit format is intended for applications requiring results in higher than quadruple precision. Its range greatly exceeds what is needed to describe all known physical limitations within the observable universe or precisions better than Planck units.


IEEE 754 octuple-precision binary floating-point format: binary256

In its 2008 revision, the IEEE 754 standard specifies a binary256 format among the ''interchange formats'' (it is not a basic format), as having:

* Sign bit: 1 bit
* Exponent width: 19 bits
* Significand precision: 237 bits (236 explicitly stored)

The format is written with an implicit lead bit with value 1 unless the exponent is all zeros. Thus only 236 bits of the significand appear in the memory format, but the total precision is 237 bits (approximately 71 decimal digits: log10(2^237) ≈ 71.34). The bits are laid out, from most significant to least significant, as 1 sign bit, the 19-bit biased exponent, and the 236-bit fraction.
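As an illustrative sketch (not normative pseudocode from the standard), the following Python routine shows how a 256-bit pattern splits into these three fields and how the represented value is reconstructed with exact rational arithmetic; the helper name decode_binary256 is chosen here only for clarity.

    from fractions import Fraction

    EXP_BITS = 19                       # binary256 exponent width
    FRAC_BITS = 236                     # explicitly stored significand bits
    BIAS = (1 << (EXP_BITS - 1)) - 1    # 262143

    def decode_binary256(bits):
        """Decode a 256-bit integer holding a binary256 encoding into an exact value."""
        sign = (bits >> 255) & 1
        exponent = (bits >> FRAC_BITS) & ((1 << EXP_BITS) - 1)
        fraction = bits & ((1 << FRAC_BITS) - 1)

        if exponent == (1 << EXP_BITS) - 1:      # all ones: infinity or NaN
            if fraction == 0:
                return float("-inf") if sign else float("inf")
            return float("nan")
        if exponent == 0:                        # all zeros: zero or subnormal, no implicit 1
            value = Fraction(fraction, 1 << FRAC_BITS) * Fraction(2) ** (1 - BIAS)
        else:                                    # normal number: implicit leading 1
            value = (1 + Fraction(fraction, 1 << FRAC_BITS)) * Fraction(2) ** (exponent - BIAS)
        return -value if sign else value

    # The pattern 3fff f000...0000 (sign 0, biased exponent 3FFFF, fraction 0) encodes 1.
    assert decode_binary256(0x3FFFF << 236) == 1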


Exponent encoding

The octuple-precision binary floating-point exponent is encoded using an offset-binary representation, with the zero offset being 262143; this offset is also known as the exponent bias in the IEEE 754 standard.

* Emin = −262142
* Emax = 262143
* Exponent bias = 3FFFF₁₆ = 262143

Thus, as defined by the offset-binary representation, in order to get the true exponent, the offset of 262143 has to be subtracted from the stored exponent. The stored exponents 00000₁₆ and 7FFFF₁₆ are interpreted specially. The minimum strictly positive (subnormal) value is 2^−262378 ≈ 10^−78984 and has a precision of only one bit. The minimum positive normal value is 2^−262142 ≈ 2.4824 × 10^−78913. The maximum representable value is 2^262144 − 2^261907 ≈ 1.6113 × 10^78913.
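The bias and extreme exponents above follow directly from the 19-bit exponent width; a small Python check (a sketch for illustration, not tied to any library) recovers them together with the approximate decimal exponents quoted in the text:

    from math import log10

    EXP_BITS, FRAC_BITS = 19, 236
    bias = (1 << (EXP_BITS - 1)) - 1           # 262143
    e_min, e_max = 1 - bias, bias              # -262142 and 262143

    # Approximate decimal exponents of the extreme values (ordinary floats suffice
    # for the logarithms even though the values themselves are far out of range).
    print((e_min - FRAC_BITS) * log10(2))      # ~ -78983.6  (smallest subnormal)
    print(e_min * log10(2))                    # ~ -78912.6  (smallest normal)
    print((e_max + 1) * log10(2))              # ~  78913.2  (log of 2^262144, marginally above the largest finite value)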


Octuple-precision examples

These examples are given in bit ''representation'', in hexadecimal, of the floating-point value. This includes the sign, (biased) exponent, and significand.

0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000₁₆ = +0
8000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000₁₆ = −0

7fff f000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000₁₆ = +infinity
ffff f000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000₁₆ = −infinity

0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001₁₆ = 2^−262142 × 2^−236 = 2^−262378
  ≈ 2.24800708647703657297018614776265182597360918266100276294348974547709294462 × 10^−78984
  (smallest positive subnormal number)

0000 0fff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff₁₆ = 2^−262142 × (1 − 2^−236)
  ≈ 2.4824279514643497882993282229138717236776877060796468692709532979137875392 × 10^−78913
  (largest subnormal number)

0000 1000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000₁₆ = 2^−262142
  ≈ 2.48242795146434978829932822291387172367768770607964686927095329791378756168 × 10^−78913
  (smallest positive normal number)

7fff efff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff₁₆ = 2^262143 × (2 − 2^−236)
  ≈ 1.61132571748576047361957211845200501064402387454966951747637125049607182699 × 10^78913
  (largest normal number)

3fff efff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff₁₆ = 1 − 2^−237
  ≈ 0.999999999999999999999999999999999999999999999999999999999999999999999995472
  (largest number less than one)

3fff f000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000₁₆ = 1 (one)

3fff f000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001₁₆ = 1 + 2^−236
  ≈ 1.00000000000000000000000000000000000000000000000000000000000000000000000906
  (smallest number larger than one)

By default, 1/3 rounds down like double precision, because of the odd number of bits in the significand. So the bits beyond the rounding point are 0101..., which is less than 1/2 of a unit in the last place.
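The rounding of 1/3 can be checked with exact rational arithmetic. The following Python sketch (assuming round-to-nearest, ties to even, and ignoring the subnormal and overflow cases) rounds a positive rational to a 237-bit significand and confirms that the result falls below 1/3:

    from fractions import Fraction

    PREC = 237                      # total significand bits, including the implicit leading 1

    def round_to_binary256(x):
        """Round a positive rational in the normal range to the nearest 237-bit significand."""
        e = 0                       # find e with 1 <= x / 2**e < 2
        while x >= 2:
            x /= 2
            e += 1
        while x < 1:
            x *= 2
            e -= 1
        scaled = x * (1 << (PREC - 1))          # significand scaled to a PREC-bit integer
        sig, rem = int(scaled), scaled - int(scaled)
        if rem > Fraction(1, 2) or (rem == Fraction(1, 2) and sig % 2 == 1):
            sig += 1                            # round half to even
        return Fraction(sig, 1 << (PREC - 1)) * Fraction(2) ** e

    third = round_to_binary256(Fraction(1, 3))
    print(third < Fraction(1, 3))   # True: the discarded bits are 0101..., under half a ulp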


Implementations

Octuple precision is rarely implemented, since the need for it is extremely rare. Apple Inc. had an implementation of addition, subtraction and multiplication of octuple-precision numbers with a 224-bit two's-complement significand and a 32-bit exponent. One can use general arbitrary-precision arithmetic libraries to obtain octuple (or higher) precision, but specialized octuple-precision implementations may achieve higher performance.
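As a sketch of the library route mentioned above, an arbitrary-precision package such as mpmath (an assumption; any comparable library would do) can be set to a 237-bit working precision, reproducing the significand precision of binary256, though not its bounded 19-bit exponent range or its subnormal behaviour:

    from mpmath import mp, mpf

    mp.prec = 237          # significand precision in bits, matching binary256
    third = mpf(1) / 3
    print(third)           # printed with roughly 71 significant decimal digits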


Hardware support

There is no known hardware implementation of octuple precision.


See also

* IEEE 754
* ISO/IEC 10967, Language-independent arithmetic
* Primitive data type
* Scientific notation

