HOME

TheInfoList



OR:

Chen–Ho encoding is a memory-efficient alternate system of binary encoding for decimal digits. The traditional system of binary encoding for decimal digits, known as
binary-coded decimal In computing and electronic systems, binary-coded decimal (BCD) is a class of binary encodings of decimal numbers where each digit is represented by a fixed number of bits, usually four or eight. Sometimes, special bit patterns are used fo ...
(BCD), uses four bits to encode each digit, resulting in significant wastage of binary data bandwidth (since four bits can store 16 states and are being used to store only 10), even when using packed BCD. The encoding reduces the storage requirements of two decimal digits (100 states) from 8 to 7 bits, and those of three decimal digits (1000 states) from 12 to 10 bits using only simple
Boolean Any kind of logic, function, expression, or theory based on the work of George Boole is considered Boolean. Related to this, "Boolean" may refer to: * Boolean data type, a form of data with only two possible values (usually "true" and "false" ...
transformations avoiding any complex arithmetic operations like a
base conversion Positional notation (or place-value notation, or positional numeral system) usually denotes the extension to any base of the Hindu–Arabic numeral system (or decimal system). More generally, a positional system is a numeral system in which th ...
.


History

In what appears to have been a
multiple discovery Multiple may refer to: Economics *Multiple finance, a method used to analyze stock prices *Multiples of the price-to-earnings ratio *Chain stores, are also referred to as 'Multiples' * Box office multiple, the ratio of a film's total gross to th ...
, some of the concepts behind what later became known as Chen–Ho encoding were independently developed by Theodore M. Hertz in 1969 and by Tien Chi Chen () (1928–) in 1971. Hertz of Rockwell filed a patent for his encoding in 1969, which was granted in 1971. Chen first discussed his ideas with Irving Tze Ho () (1921–2003) in 1971. Chen and Ho were both working for IBM at the time, albeit in different locations. Chen also consulted with Frank Chin Tung to verify the results of his theories independently. IBM filed a patent in their name in 1973, which was granted in 1974. At least by 1973, Hertz's earlier work must have been known to them, as the patent cites his patent as
prior art Prior art (also known as state of the art or background art) is a concept in patent law used to determine the patentability of an invention, in particular whether an invention meets the novelty and the inventive step or non-obviousness criteria ...
. With input from Joseph D. Rutledge and John C. McPherson, the final version of the Chen–Ho encoding was circulated inside IBM in 1974 and published in 1975 in the journal ''
Communications of the ACM ''Communications of the ACM'' is the monthly journal of the Association for Computing Machinery (ACM). It was established in 1958, with Saul Rosen as its first managing editor. It is sent to all ACM members. Articles are intended for readers with ...
''. This version included several refinements, primarily related to the application of the encoding system. It constitutes a Huffman-like prefix code. The encoding was referred to as ''Chen and Ho's scheme'' in 1975, ''Chen's encoding'' in 1982 and became known as ''Chen–Ho encoding'' or ''Chen–Ho algorithm'' since 2000. After having filed a patent for it in 2001,
Michael F. Cowlishaw Mike Cowlishaw is a visiting professor at the Department of Computer Science at the University of Warwick, and a Fellow of the Royal Academy of Engineering. He is a retired IBM Fellow, and was a Fellow of the Institute of Engineering and Techn ...
published a further refinement of Chen–Ho encoding known as
densely packed decimal Densely packed decimal (DPD) is an efficient method for binary encoding decimal digits. The traditional system of binary encoding for decimal digits, known as binary-coded decimal (BCD), uses four bits to encode each digit, resulting in signi ...
(DPD) encoding in ''IEE Proceedings – Computers and Digital Techniques'' in 2002. DPD has subsequently been adopted as the ''decimal encoding'' used in the
IEEE 754-2008 The Institute of Electrical and Electronics Engineers (IEEE) is a 501(c)(3) professional association for electronic engineering and electrical engineering (and associated disciplines) with its corporate office in New York City and its operati ...
and ISO/IEC/IEEE 60559:2011
floating-point In computing, floating-point arithmetic (FP) is arithmetic that represents real numbers approximately, using an integer with a fixed precision, called the significand, scaled by an integer exponent of a fixed base. For example, 12.345 can be ...
standards.


Application

Chen noted that the digits zero through seven were simply encoded using three binary digits of the corresponding
octal The octal numeral system, or oct for short, is the radix, base-8 number system, and uses the Numerical digit, digits 0 to 7. This is to say that 10octal represents eight and 100octal represents sixty-four. However, English, like most languages, ...
group. He also postulated that one could use a
flag A flag is a piece of fabric (most often rectangular or quadrilateral) with a distinctive design and colours. It is used as a symbol, a signalling device, or for decoration. The term ''flag'' is also used to refer to the graphic design emp ...
to identify a different encoding for the digits eight and nine, which would be encoded using a single bit. In practice, a series of
Boolean Any kind of logic, function, expression, or theory based on the work of George Boole is considered Boolean. Related to this, "Boolean" may refer to: * Boolean data type, a form of data with only two possible values (usually "true" and "false" ...
transformations are applied to the stream of input bits, compressing BCD encoded digits from 12 bits per three digits to 10 bits per three digits. Reversed transformations are used to decode the resulting coded stream to BCD. Equivalent results can also be achieved by the use of a
look-up table In computer science, a lookup table (LUT) is an array that replaces runtime computation with a simpler array indexing operation. The process is termed as "direct addressing" and LUTs differ from hash tables in a way that, to retrieve a value v wi ...
. Chen–Ho encoding is limited to encoding sets of three decimal digits into groups of 10 bits (so called '' declets''). Of the 1024 states possible by using 10 bits, it leaves only 24 states unused (with don't care bits typically set to 0 on write and ignored on read). With only 0.34% wastage it gives a 20% more efficient encoding than BCD with one digit in 4 bits. Both, Hertz and Chen also proposed similar, but less efficient, encoding schemes to compress sets of two decimal digits (requiring 8 bits in BCD) into groups of 7 bits. Larger sets of decimal digits could by divided into three- and two-digit groups. The patents also discuss the possibility to adapt the scheme to digits encoded in any other decimal codes than
8-4-2-1 BCD In computing and electronic systems, binary-coded decimal (BCD) is a class of binary encodings of decimal numbers where each digit is represented by a fixed number of bits, usually four or eight. Sometimes, special bit patterns are used fo ...
, like f.e. Excess-3,
Excess-6 Offset binary, also referred to as excess-K, excess-''N'', excess-e, excess code or biased representation, is a method for signed number representation where a signed number n is represented by the bit pattern corresponding to the unsigned numb ...
, Jump-at-2, Jump-at-8,
Gray Grey (more common in British English) or gray (more common in American English) is an intermediate color between black and white. It is a neutral or achromatic color, meaning literally that it is "without color", because it can be composed ...
, Glixon, O'Brien type-I and
Gray–Stibitz code The reflected binary code (RBC), also known as reflected binary (RB) or Gray code after Frank Gray, is an ordering of the binary numeral system such that two successive values differ in only one bit (binary digit). For example, the representa ...
. The same principles could also be applied to other bases. In 1973, some form of Chen–Ho encoding appears to have been utilized in the address conversion hardware of the optional IBM 7070/ 7074 emulation feature for the IBM System/370 Model 165 and
370 Model 168 37 may refer to: * 37 (number), the natural number following 36 and preceding 38 Years * 37 BC * AD 37 * 1937 * 2037 Other uses * ''37'' (album), by King Never, 2013 * ''37'' (film), a 2016 film about the murder of Kitty Genovese * 37 (MBTA bu ...
computers. One prominent application uses a 128-bit register to store 33 decimal digits with a three digit exponent, effectively not less than what could be achieved using binary encoding (whereas BCD encoding would need 144 bits to store the same number of digits).


Encodings for two decimal digits


Hertz encoding

* This encoding is not
parity Parity may refer to: * Parity (computing) ** Parity bit in computing, sets the parity of data for the purpose of error detection ** Parity flag in computing, indicates if the number of set bits is odd or even in the binary representation of the r ...
-preserving.


Early Chen–Ho encoding, method A

* This encoding is not parity-preserving.


Early Chen–Ho encoding, method B

* This encoding is not parity-preserving.


Patented and final Chen–Ho encoding

* Assuming certain values for the don't-care bits (f.e. 0), this encoding is
parity Parity may refer to: * Parity (computing) ** Parity bit in computing, sets the parity of data for the purpose of error detection ** Parity flag in computing, indicates if the number of set bits is odd or even in the binary representation of the r ...
-preserving.


Encodings for three decimal digits


Hertz encoding

* This encoding is not parity-preserving.


Early Chen–Ho encoding

* This encoding is not parity-preserving.


Patented Chen–Ho encoding

* This encoding is not parity-preserving.


Final Chen–Ho encoding

* This encoding is not parity-preserving.


Storage efficiency


See also

*
Binary-coded decimal In computing and electronic systems, binary-coded decimal (BCD) is a class of binary encodings of decimal numbers where each digit is represented by a fixed number of bits, usually four or eight. Sometimes, special bit patterns are used fo ...
(BCD) *
Densely packed decimal Densely packed decimal (DPD) is an efficient method for binary encoding decimal digits. The traditional system of binary encoding for decimal digits, known as binary-coded decimal (BCD), uses four bits to encode each digit, resulting in signi ...
(DPD) *
DEC RADIX 50 RADIX 50 or RAD50 (also referred to as RADIX50, RADIX-50 or RAD-50), is an uppercase-only character encoding created by Digital Equipment Corporation (DEC) for use on their DECsystem, PDP, and VAX computers. RADIX 50's 40-character ...
/ MOD40 *
IBM SQUOZE SQUOZE (abbreviated as SQZ) is a memory-efficient representation of a combined source and relocatable object program file with a symbol table on punched cards which was introduced in 1958 with the SCAT assembler on the SHARE Operating Syst ...
* Packed BCD *
Unicode transformation format Unicode, formally The Unicode Standard,The formal version reference is is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard, whi ...
(UTF) (similar encoding scheme) * Length-limited Huffman code


Notes


References


Further reading

* * * (60 pages

(40 pages

and (11 pages

(NB. Three expired patents cited in both, the and s.) * * * {{DEFAULTSORT:Chen-Ho encoding Binary arithmetic