The bfloat16 (Brain Floating Point) floating-point format is a computer number format occupying 16 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point. This format is a truncated (16-bit) version of the 32-bit IEEE 754 single-precision floating-point format (binary32) with the intent of accelerating machine learning and near-sensor computing. It preserves the approximate dynamic range of 32-bit floating-point numbers by retaining 8 exponent bits, but supports only an 8-bit precision rather than the 24-bit significand of the binary32 format. More so than single-precision 32-bit floating-point numbers, bfloat16 numbers are unsuitable for integer calculations, but this is not their intended use. Bfloat16 is used to reduce the storage requirements and increase the calculation speed of machine learning algorithms.
The bfloat16 format was developed by Google Brain, an artificial intelligence research group at Google.
The bfloat16 format is utilized in Intel AI processors, such as Nervana NNP-L1000, Xeon processors (AVX-512 BF16 extensions), and Intel FPGAs, Google Cloud TPUs, and TensorFlow. ARMv8.6-A, AMD ROCm, and CUDA also support the bfloat16 format. On these platforms, bfloat16 may also be used in mixed-precision arithmetic, where bfloat16 numbers may be operated on and expanded to wider data types.
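The mixed-precision pattern described above (narrow inputs, wider accumulator) can be sketched in pure Python. This is an emulation only; real platforms do the widening in hardware, and the function names here are illustrative, not any library's API:

```python
import struct

def bf16(x: float) -> float:
    """Emulate bfloat16 storage by truncating a binary32 to its top 16 bits."""
    b = struct.unpack(">I", struct.pack(">f", x))[0] & 0xFFFF0000
    return struct.unpack(">f", struct.pack(">I", b))[0]

def mixed_precision_dot(xs, ys):
    """Dot product with bfloat16-rounded inputs and a wider accumulator,
    the usual mixed-precision arrangement on BF16-capable hardware."""
    acc = 0.0  # Python float: accumulation at least binary64-wide
    for x, y in zip(xs, ys):
        acc += bf16(x) * bf16(y)  # products are widened before summing
    return acc

print(mixed_precision_dot([1.0, 2.0], [3.0, 4.0]))  # 11.0
```

Keeping the accumulator wide avoids compounding the 8-bit-significand rounding error across many additions, which is why hardware dot-product instructions for bfloat16 typically accumulate in single precision or wider.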
bfloat16 floating-point format
bfloat16 has the following format:
* Sign bit: 1 bit
* Exponent width: 8 bits
* Significand precision: 8 bits (7 explicitly stored), as opposed to 24 bits in a classical single-precision floating-point format
The bfloat16 format, being a truncated IEEE 754 single-precision 32-bit float, allows for fast conversion to and from an IEEE 754 single-precision 32-bit float; in conversion to the bfloat16 format, the exponent bits are preserved while the significand field can be reduced by truncation (thus corresponding to round toward 0), ignoring the NaN special case. Preserving the exponent bits maintains the 32-bit float's range of ≈ 10^−38 to ≈ 3 × 10^38.
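The truncating conversion above can be sketched in Python using the standard `struct` module (the function names are illustrative, not a standard API):

```python
import struct

def float32_to_bfloat16_bits(x: float) -> int:
    """Convert a Python float to bfloat16 bits by truncating a binary32.

    Dropping the low 16 bits of the binary32 encoding preserves the sign
    and exponent bits and truncates the significand, i.e. round toward 0
    (NaN payloads ignored, as noted above).
    """
    bits32 = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits32 >> 16

def bfloat16_bits_to_float32(bits16: int) -> float:
    """Widen bfloat16 bits back to a binary32 by appending 16 zero bits."""
    return struct.unpack(">f", struct.pack(">I", bits16 << 16))[0]

print(hex(float32_to_bfloat16_bits(1.0)))   # 0x3f80
print(hex(float32_to_bfloat16_bits(-2.0)))  # 0xc000
print(bfloat16_bits_to_float32(0x4049))     # 3.140625
```

The widening direction is exact: every bfloat16 value is also a binary32 value, so conversion back loses nothing.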
The bits are laid out as follows, from most significant to least significant: 1 sign bit, 8 exponent bits, and 7 fraction bits.
Exponent encoding
The bfloat16 binary floating-point exponent is encoded using an offset-binary representation, with the zero offset being 127; this is also known as the exponent bias in the IEEE 754 standard.
* Emin = 01H − 7FH = −126
* Emax = FEH − 7FH = 127
* Exponent bias = 7FH = 127
Thus, in order to get the true exponent as defined by the offset-binary representation, the offset of 127 has to be subtracted from the value of the exponent field.
The minimum and maximum values of the exponent field (00H and FFH) are interpreted specially, like in the IEEE 754 standard formats.
The minimum positive normal value is 2^−126 ≈ 1.18 × 10^−38 and the minimum positive (subnormal) value is 2^(−126−7) = 2^−133 ≈ 9.2 × 10^−41.
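The bias arithmetic and the extreme values quoted above can be checked directly (a plain-Python sketch; the names are ours, not a standard API):

```python
BIAS = 0x7F  # 127, the bfloat16/binary32 exponent bias

def true_exponent(exponent_field: int) -> int:
    """Biased exponent field (1..254 for normal numbers) -> true exponent."""
    return exponent_field - BIAS

print(true_exponent(0x01))  # -126 (Emin)
print(true_exponent(0xFE))  # 127  (Emax)

# Smallest positive normal and subnormal bfloat16 values:
min_normal = 2.0 ** -126           # ≈ 1.18e-38
min_subnormal = 2.0 ** (-126 - 7)  # = 2^-133 ≈ 9.2e-41
print(min_normal, min_subnormal)
```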
Encoding of special values
Positive and negative infinity
Just as in IEEE 754, positive and negative infinity are represented with their corresponding sign bits, all 8 exponent bits set (FFhex) and all significand bits zero. Explicitly,
val s_exponent_signcnd
+inf = 0_11111111_0000000
-inf = 1_11111111_0000000
Not a Number
Just as in IEEE 754, NaN values are represented with either sign bit, all 8 exponent bits set (FFhex) and not all significand bits zero. Explicitly,
val s_exponent_signcnd
+NaN = 0_11111111_klmnopq
-NaN = 1_11111111_klmnopq
where at least one of ''k, l, m, n, o, p,'' or ''q'' is 1. As with IEEE 754, NaN values can be quiet or signaling, although there are no known uses of signaling bfloat16 NaNs as of September 2018.
Range and precision
Bfloat16 is designed to maintain the number range from the 32-bit IEEE 754 single-precision floating-point format (binary32), while reducing the precision from 24 bits to 8 bits. This means that the precision is between two and three decimal digits, and bfloat16 can represent finite values up to about 3.4 × 10^38.
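The "two to three decimal digits" claim can be illustrated by rounding a few values to bfloat16 via truncation and widening back (a sketch, not a library routine; truncation rounds toward zero rather than to nearest):

```python
import struct

def to_bfloat16(x: float) -> float:
    """Truncate x's binary32 encoding to bfloat16, then widen back."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0] & 0xFFFF0000
    return struct.unpack(">f", struct.pack(">I", bits))[0]

# With an 8-bit significand the relative spacing is 2^-8 ≈ 0.4%,
# so roughly 2-3 significant decimal digits survive.
print(to_bfloat16(3.14159265))  # 3.140625
print(to_bfloat16(100.5))       # 100.5  (exactly representable)
print(to_bfloat16(1001.0))      # 1000.0 (fourth decimal digit lost)
```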
Examples
These examples are given in bit ''representation'', in hexadecimal and binary, of the floating-point value. This includes the sign, (biased) exponent, and significand.
3f80 = 0 01111111 0000000 = 1
c000 = 1 10000000 0000000 = −2
7f7f = 0 11111110 1111111 = (2^8 − 1) × 2^−7 × 2^127 ≈ 3.38953139 × 10^38 (max finite positive value in bfloat16 precision)
0080 = 0 00000001 0000000 = 2^−126 ≈ 1.175494351 × 10^−38 (min normalized positive value in bfloat16 precision and single-precision floating point)
The maximum positive finite value of a normal bfloat16 number is 3.38953139 × 10^38, slightly below (2^24 − 1) × 2^−23 × 2^127 = 3.402823466 × 10^38, the max finite positive value representable in single precision.
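The worked examples above can be verified with a small decoder that applies the encoding rules stated earlier (sign, biased exponent, 7-bit fraction with an implicit leading 1 for normal numbers). The function name is illustrative:

```python
def decode_bfloat16(bits: int) -> float:
    """Decode a 16-bit bfloat16 pattern into a Python float."""
    sign = -1.0 if bits >> 15 else 1.0
    exp_field = (bits >> 7) & 0xFF
    frac = bits & 0x7F
    if exp_field == 0xFF:                     # infinity or NaN
        return sign * float("inf") if frac == 0 else float("nan")
    if exp_field == 0:                        # zero or subnormal: no implicit 1
        return sign * frac * 2.0 ** (-126 - 7)
    # Normal number: implicit leading 1, biased exponent
    return sign * (1 + frac / 128) * 2.0 ** (exp_field - 127)

print(decode_bfloat16(0x3F80))  # 1.0
print(decode_bfloat16(0xC000))  # -2.0
print(decode_bfloat16(0x7F7F))  # ≈ 3.38953139e38 (max finite)
print(decode_bfloat16(0x0080))  # ≈ 1.175494351e-38 (min normal)
```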
Zeros and infinities
0000 = 0 00000000 0000000 = 0
8000 = 1 00000000 0000000 = −0
7f80 = 0 11111111 0000000 = infinity
ff80 = 1 11111111 0000000 = −infinity
Special values
4049 = 0 10000000 1001001 = 3.140625 ≈ π (pi)
3eab = 0 01111101 0101011 = 0.333984375 ≈ 1/3
NaNs
ffc1 = x 11111111 1000001 => qNaN
ff81 = x 11111111 0000001 => sNaN
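A classifier for these special patterns can be sketched as follows. It assumes the common IEEE 754 convention that a set most-significant significand bit marks a quiet NaN and a clear one marks a signaling NaN, which matches the qNaN/sNaN examples above:

```python
def classify_bfloat16(bits: int) -> str:
    """Classify a bfloat16 bit pattern as finite, infinite, qNaN, or sNaN."""
    exp_field = (bits >> 7) & 0xFF
    frac = bits & 0x7F
    if exp_field != 0xFF:
        return "finite"
    if frac == 0:
        return "-inf" if bits >> 15 else "+inf"
    # Top fraction bit distinguishes quiet from signaling NaN
    return "qNaN" if frac & 0x40 else "sNaN"

print(classify_bfloat16(0xFFC1))  # qNaN
print(classify_bfloat16(0xFF81))  # sNaN
print(classify_bfloat16(0x7F80))  # +inf
```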
See also
* Half-precision floating-point format: 16-bit float w/ 1-bit sign, 5-bit exponent, and 11-bit significand, as defined by IEEE 754
* ISO/IEC 10967, Language Independent Arithmetic
* Primitive data type
* Minifloat
* Google Brain