In telecommunications, 8b/10b is a line code that maps 8-bit words to 10-bit symbols to achieve DC balance and bounded disparity, and at the same time provide enough state changes to allow reasonable clock recovery. This means that the difference between the counts of ones and zeros in a string of ''at least'' 20 bits is no more than two, and that there are not more than five ones or zeros in a row. This helps to reduce the demand for the lower bandwidth limit of the channel necessary to transfer the signal.

An 8b/10b code can be implemented in various ways, where the design may focus on specific parameters such as hardware requirements, DC-balance, etc. One implementation was designed by K. Odaka for the DAT digital audio recorder. Kees Schouhamer Immink designed an 8b/10b code for the DCC audio recorder. The IBM implementation was described in 1983 by Al Widmer and Peter Franaszek.
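The two properties named above (bounded disparity and a run-length limit) can be checked on any bit string. The following Python sketch is only an illustration: the helper names are hypothetical, and the two sample symbols (K.28.5 and D.21.5, both in the form used when the running disparity is −1) are assumed from the commonly published IBM code tables rather than taken from this text.

```python
def max_run_length(bits: str) -> int:
    """Return the length of the longest run of identical bits in a '0'/'1' string."""
    longest = run = 0
    previous = None
    for bit in bits:
        run = run + 1 if bit == previous else 1
        longest = max(longest, run)
        previous = bit
    return longest


def disparity(bits: str) -> int:
    """Return the count of ones minus the count of zeros."""
    return bits.count("1") - bits.count("0")


# Assumed sample symbols, written in transmission order (abcdei fghj).
K28_5 = "0011111010"
D21_5 = "1010101010"
stream = K28_5 + D21_5

print(max_run_length(stream))  # longest run in these 20 bits
print(disparity(stream))       # ones minus zeros over these 20 bits
```

For this 20-bit sample the longest run is five bits and the disparity is +2, which sits exactly at the limits stated above.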

IBM implementation

As the scheme name suggests, eight bits of data are transmitted as a 10-bit entity called a ''symbol'', or ''character''. The low five bits of data are encoded into a 6-bit group (the 5b/6b portion) and the top three bits are encoded into a 4-bit group (the 3b/4b portion). These code groups are concatenated together to form the 10-bit symbol that is transmitted on the wire. The ''data symbols'' are often referred to as D.x.y where x ranges over 0–31 and y over 0–7.

Standards using the 8b/10b encoding also define up to 12 ''special symbols'' (or ''control characters'') that can be sent in place of a ''data symbol''. They are often used to indicate start-of-frame, end-of-frame, link idle, skip and similar link-level conditions. At least one of them (a "comma" symbol) needs to be used to define the alignment of the 10-bit symbols. They are referred to as K.x.y and have different encodings from any of the D.x.y symbols.

Because 8b/10b encoding uses 10-bit symbols to encode 8-bit words, some of the 1024 (2¹⁰) possible 10-bit symbols can be excluded to guarantee a run-length limit of 5 consecutive equal bits and to ensure that the difference between the counts of zeros and ones is no more than two. Some of the 256 possible 8-bit words can be encoded in two different ways. Using these alternative encodings, the scheme is able to achieve long-term DC-balance in the serial data stream. This permits the data stream to be transmitted through a channel with a high-pass characteristic, for example Ethernet's transformer-coupled unshielded twisted pair or optical receivers using automatic gain control.
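The D.x.y naming follows directly from the split into the low five and top three bits. A minimal Python sketch, assuming a hypothetical helper name and a zero-padded two-digit x purely for readability:

```python
def data_symbol_name(byte: int) -> str:
    """Name a data byte as D.x.y: x is the low five bits, y the top three bits."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    x = byte & 0x1F   # low five bits, input to the 5b/6b portion (0..31)
    y = byte >> 5     # top three bits, input to the 3b/4b portion (0..7)
    return f"D.{x:02d}.{y}"


print(data_symbol_name(0xC3))  # D.03.6
print(data_symbol_name(0xB5))  # D.21.5
```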

Encoding tables

Note that in the following tables, for each input byte, ''A'' is the least significant bit, and ''H'' the most significant. The output gains two extra bits, i and j. The bits are sent low to high: a, b, c, d, e, i, f, g, h, and j; i.e., the 5b/6b code followed by the 3b/4b code. This ensures the uniqueness of the special bit sequence in the comma symbols.

The residual effect on the stream of the number of zero and one bits transmitted is maintained as the ''running disparity'' (''RD''), and the effect of slew is balanced by the choice of encoding for following symbols.

The 5b/6b code is a paired disparity code, and so is the 3b/4b code. Each 6- or 4-bit code word has either equal numbers of zeros and ones (a disparity of zero), or comes in a pair of forms, one with two more zeros than ones (four zeros and two ones, or three zeros and one one, respectively) and one with two fewer. When a 6- or 4-bit code is used that has a non-zero disparity (count of ones minus count of zeros; i.e., −2 or +2), the choice of positive or negative disparity encodings must be the one that toggles the running disparity. In other words, the non-zero disparity codes alternate.

Running disparity

8b/10b coding is DC-free, meaning that the long-term ratio of ones and zeros transmitted is exactly 50%. To achieve this, the difference between the number of ones transmitted and the number of zeros transmitted is always limited to ±2, and at the end of each symbol, it is either +1 or −1. This difference is known as the ''running disparity'' (RD).

This scheme needs only two states for the running disparity, +1 and −1. It starts at −1.

For each 5b/6b and 3b/4b code with an unequal number of ones and zeros, there are two bit patterns that can be used to transmit it: one with two more "1" bits, and one with all bits inverted and thus two more zeros. Depending on the current running disparity of the signal, the encoding engine selects which of the two possible six- or four-bit sequences to send for the given data.

If the six-bit or four-bit code has equal numbers of ones and zeros, there is no choice to make, as the disparity would be unchanged, with the exceptions of sub-blocks D.07 (00111) and D.x.3 (011). In either case the disparity is still unchanged, but if RD is positive when D.07 is encountered, 000111 is used, and if it is negative, 111000 is used. Likewise, if RD is positive when D.x.3 is encountered, 0011 is used, and if it is negative, 1100 is used. This is reflected in the tables below, but is worth mentioning separately, as these are the only two sub-blocks with equal numbers of 1s and 0s that each have two possible encodings.
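The selection rule can be sketched in a few lines of Python. This is an illustration, not the IBM encoder: the function name is hypothetical, the caller supplies both forms of a sub-block, and the unbalanced example pair (1011 / 0100) is assumed from the commonly published 3b/4b table. The D.07 and D.x.3 pairs are the ones given in the text.

```python
def select_form(form_rd_minus: str, form_rd_plus: str, rd: int):
    """Pick the sub-block form for the current running disparity.

    rd is -1 or +1. Returns (code_word, new_rd), where new_rd adds the
    disparity (ones minus zeros) of the chosen code word.
    """
    word = form_rd_minus if rd < 0 else form_rd_plus
    new_rd = rd + word.count("1") - word.count("0")
    return word, new_rd


rd = -1                                            # the scheme starts at RD = -1
word, rd = select_form("1011", "0100", rd)         # assumed unbalanced pair: RD flips to +1
print(word, rd)
word, rd = select_form("111000", "000111", rd)     # D.07: balanced, RD stays +1
print(word, rd)
word, rd = select_form("1100", "0011", rd)         # D.x.3: balanced, RD stays +1
print(word, rd)
```

Because the form with extra ones is only ever chosen when RD is −1, and the form with extra zeros only when RD is +1, the running disparity never leaves the two states.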

5b/6b code (abcdei)

† also used for the 5b/6b code of K.x.7
‡ exclusively used for the 5b/6b code of K.28.y

3b/4b code (fghj)

† For D.x.7, either the Primary (D.x.P7) or the Alternate (D.x.A7) encoding must be selected in order to avoid a run of five consecutive 0s or 1s when combined with the preceding 5b/6b code. Sequences of exactly five identical bits are used in comma symbols for synchronization.
D.x.A7 is used only
* when RD = −1: for ''x'' = 17, 18 and 20 and
* when RD = +1: for ''x'' = 11, 13 and 14.
With ''x'' = 23, ''x'' = 27, ''x'' = 29, and ''x'' = 30, the 3b/4b code portion used for control symbols K.x.7 is the same as that for D.x.A7. No other D.x.A7 code can be used, as it could produce misaligned comma sequences.
‡ Only K.28.1, K.28.5, and K.28.7 generate comma symbols, which contain a bit sequence of five 0s or 1s. The symbol has the format 110000 01xx or 001111 10xx.

Control symbols

The control symbols within 8b/10b are 10b symbols that are valid sequences of bits (no more than six 1s or 0s) but do not have a corresponding 8b data byte. They are used for low-level control functions. For instance, in Fibre Channel, K28.5 is used at the beginning of four-byte sequences (called "Ordered Sets") that perform functions such as Loop Arbitration, Fill Words, Link Resets, etc.

Resulting from the 5b/6b and 3b/4b tables, the following 12 control symbols are allowed to be sent: K.28.0 through K.28.7, K.23.7, K.27.7, K.29.7, and K.30.7.

† Within the control symbols, K.28.1, K.28.5, and K.28.7 are "comma symbols". Comma symbols are used for synchronization (finding the alignment of the 8b/10b codes within a bit-stream). If K.28.7 is not used, the unique comma sequences 00111110 or 11000001 cannot be found at any bit position within any combination of normal codes.

‡ If K.28.7 is allowed in the actual coding, a more complex definition of the synchronization pattern than suggested by † needs to be used, as a combination of K.28.7 with several other codes forms a false misaligned comma symbol overlapping the two codes. A sequence of multiple K.28.7 codes is not allowable in any case, as this would result in undetectable misaligned comma symbols. K.28.7 is the only comma symbol that cannot be the result of a single bit error in the data stream.
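Comma-based alignment can be sketched as a plain substring search. The Python below is a simplified illustration under two assumptions: K.28.7 is not used on the link (so the two 8-bit comma sequences quoted above are unambiguous), and the received bits are available as a string of '0' and '1' characters. The function name is hypothetical, and the sample symbols are assumed from the commonly published code tables.

```python
COMMA_SEQUENCES = ("00111110", "11000001")  # the two patterns quoted in the text


def find_symbol_alignment(bits: str):
    """Return the bit offset (0-9) at which 10-bit symbols start, or None.

    A comma sequence begins at bit 'a' of its symbol, so the position of the
    first match, taken modulo 10, gives the symbol boundary.
    """
    positions = [bits.find(seq) for seq in COMMA_SEQUENCES]
    positions = [p for p in positions if p >= 0]
    if not positions:
        return None
    return min(positions) % 10


# Three stray bits, then K.28.5 (RD = -1 form), then D.21.5 (assumed code words).
received = "010" + "0011111010" + "1010101010"
print(find_symbol_alignment(received))  # 3
```

A real receiver would typically do this in hardware on a sliding window and keep the alignment until it loses synchronization; the sketch only shows the pattern match itself.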

Example encoding of D31.1
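The encoding of D31.1 can be traced with a small data-symbol encoder. The Python sketch below is an illustration under stated assumptions, not a reference implementation: the code-word lists are transcribed from the commonly published IBM 5b/6b and 3b/4b tables (they are not reproduced in this text and should be checked against a standard before use), only data symbols are handled, and all names are hypothetical. Code words are written in transmission order (abcdei, then fghj).

```python
# RD = -1 forms of the 5b/6b code (abcdei) for x = 0..31 (assumed table).
SIX_RD_MINUS = [
    "100111", "011101", "101101", "110001", "110101", "101001", "011001", "111000",
    "111001", "100101", "010101", "110100", "001101", "101100", "011100", "010111",
    "011011", "100011", "010011", "110010", "001011", "101010", "011010", "111010",
    "110011", "100110", "010110", "110110", "001110", "101110", "011110", "101011",
]
# RD = -1 forms of the 3b/4b code (fghj) for y = 0..7; y = 7 is the primary D.x.P7.
FOUR_RD_MINUS = ["1011", "1001", "0101", "1100", "1101", "1010", "0110", "1110"]
A7_RD_MINUS = "0111"  # alternate D.x.A7 form


def disparity(word: str) -> int:
    """Ones minus zeros."""
    return word.count("1") - word.count("0")


def invert(word: str) -> str:
    """Flip every bit of a '0'/'1' string."""
    return word.translate(str.maketrans("01", "10"))


def pick(word_rd_minus: str, rd: int, always_paired: bool = False) -> str:
    """Use the RD = -1 form as is, or its inverse when RD = +1.

    Balanced words have a single form, except D.07 and D.x.3, which the
    caller marks with always_paired=True.
    """
    if rd > 0 and (disparity(word_rd_minus) != 0 or always_paired):
        return invert(word_rd_minus)
    return word_rd_minus


def encode_data(byte: int, rd: int):
    """Encode one data byte; rd is -1 or +1. Returns (10-bit string, new rd)."""
    x, y = byte & 0x1F, byte >> 5
    six = pick(SIX_RD_MINUS[x], rd, always_paired=(x == 7))
    rd += disparity(six)
    # D.x.A7 replaces D.x.P7 where the primary form would create a run of five.
    use_a7 = y == 7 and ((rd < 0 and x in (17, 18, 20)) or (rd > 0 and x in (11, 13, 14)))
    four = pick(A7_RD_MINUS if use_a7 else FOUR_RD_MINUS[y], rd, always_paired=(y == 3))
    rd += disparity(four)
    return six + four, rd


D31_1 = (1 << 5) | 31          # y = 1, x = 31
print(encode_data(D31_1, -1))  # ('1010111001', 1)
print(encode_data(D31_1, +1))  # ('0101001001', -1)
```

Starting from RD = −1, the 5b/6b part of D31.1 is sent with two extra ones, which moves the running disparity to +1, and the balanced 3b/4b part leaves it there. Starting from RD = +1, the inverted 5b/6b form is sent instead.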

Technologies that use 8b/10b

After the IBM patent expired, the scheme became even more popular and was chosen as a DC-free line code for several communication technologies.

Among the areas in which 8b/10b encoding finds application are the following:
* Aurora
* Camera Serial Interface
* CoaXPress
* Common Public Radio Interface (CPRI)
* DVB Asynchronous serial interface (ASI)
* DVI and HDMI Video Island (transition-minimized differential signaling)
* DisplayPort 1.x
* ESCON (Enterprise Systems Connection)
* Fibre Channel
* Gigabit Ethernet (except for the twisted pair–based 1000BASE-T)
* IEEE 1394b (FireWire and others)
* InfiniBand
* JESD204B
* OBSAI RP3 interface
* PCI Express 1.x and 2.x
* Serial RapidIO
* SD UHS-II
* Serial ATA
* SAS 1.x, 2.x and 3.x
* SSA
* ServerNet (starting with ServerNet2)
* SGMII
* UniPro M-PHY
* USB 3.0
* Thunderbolt 1.x and 2.x
* XAUI
* SLVS-EC

Fibre Channel (4GFC and 8GFC variants only)

The FC-0 standard defines what encoding scheme is to be used (8b/10b or 64b/66b) in a Fibre Channel system. Higher-speed variants typically use 64b/66b to optimize bandwidth efficiency, since bandwidth overhead is 20% in 8b/10b versus approximately 3% (~2/66) in 64b/66b systems. Thus, 8b/10b encoding is used for 4GFC and 8GFC variants; for 10GFC and 16GFC variants, it is 64b/66b. The Fibre Channel ''FC1'' data link layer is then responsible for implementing the 8b/10b encoding and decoding of signals.

The Fibre Channel 8b/10b coding scheme is also used in other telecommunications systems. Data is expanded using an algorithm that creates one of two possible 10-bit output values for each input 8-bit value. Each 8-bit input value can map either to a 10-bit output value with odd disparity, or to one with even disparity. This mapping is usually done at the time when parallel input data is converted into a serial output stream for transmission over a fibre channel link. The odd/even selection is done in such a way that a long-term zero disparity between ones and zeroes is maintained. This is often called "DC balancing".

The 8-bit to 10-bit conversion scheme uses only 512 of the possible 1024 output values. Of the remaining 512 unused output values, most contain either too many ones (or too many zeroes) and therefore are not allowed. This still leaves enough spare 10-bit odd+even coding pairs to allow for at least 12 special non-data characters.

The codes that represent the 256 data values are called the data (D) codes. The codes that represent the 12 special non-data characters are called the control (K) codes.

All of the codes can be described by stating 3 octal values. This is done with a naming convention of "Dxx.x" or "Kxx.x".

Example (here A denotes the most significant bit of the input, unlike in the encoding tables above):
:Input Data Bits: ABCDEFGH
:Data is split: ABC DEFGH
:Data is shuffled: DEFGH ABC

Now these bits are converted to decimal in the way they are paired.

Input data C3 (HEX) = 11000011 = 110 00011 = 00011 110 = 3 6, so the 8b/10b code name is D03.6.

Digital audio

8b/10b encoding schemes have found heavy use in digital audio storage applications, namely
* Digital Audio Tape, US Patent 4,456,905, June 1984 by K. Odaka.
* Digital Compact Cassette (DCC), US Patent 4,620,311, October 1986 by Kees Schouhamer Immink.

A differing but related scheme is used for audio CDs and CD-ROMs:
* Compact disc Eight-to-fourteen modulation

Alternatives

Note that 8b/10b is the encoding scheme, not a specific code. While many applications do use the same code, there exist some incompatible implementations; for example, Transition Minimized Differential Signaling also expands 8 bits to 10 bits, but uses a completely different method to do so.

64b/66b encoding, introduced for 10 Gigabit Ethernet's 10GBASE-R Physical Medium Dependent (PMD) interfaces, is a lower-overhead alternative to 8b/10b encoding, having a two-bit overhead per 64 bits (instead of sixteen bits) of encoded data. This scheme is considerably different in design from 8b/10b encoding, and does not explicitly guarantee DC balance, short run length, and transition density (these features are achieved statistically via scrambling). 64b/66b encoding has been extended to the 128b/130b and 128b/132b encoding variants for PCI Express 3.0 and USB 3.1, respectively, replacing the 8b/10b encoding in earlier revisions of each standard.
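The overhead figures follow from simple arithmetic on the block sizes. A short Python sketch, with a hypothetical helper name, that computes the share of transmitted bits not carrying payload for each scheme named above:

```python
from fractions import Fraction


def overhead(data_bits: int, coded_bits: int) -> Fraction:
    """Fraction of transmitted bits that do not carry payload data."""
    return Fraction(coded_bits - data_bits, coded_bits)


schemes = [("8b/10b", 8, 10), ("64b/66b", 64, 66),
           ("128b/130b", 128, 130), ("128b/132b", 128, 132)]
for name, data_bits, coded_bits in schemes:
    print(f"{name}: {float(overhead(data_bits, coded_bits)):.1%} of the line rate")

# Extra bits needed to carry 64 data bits with 8b/10b: eight symbols, two extra bits each.
print((64 // 8) * (10 - 8))  # 16
```

Measured this way, 8b/10b spends one fifth of the line rate on coding, while 64b/66b spends one thirty-third, roughly 3%.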
