842 (compression Algorithm)
   HOME

TheInfoList



OR:

''842'', ''8-4-2'', or ''EFT'' is a data compression algorithm. It is a variation on Lempel–Ziv compression with a limited dictionary length. With typical data, 842 gives 80 to 90 percent of the compression of LZ77 with much faster throughput and less memory use. Hardware implementations also provide minimal use of energy and minimal chip area. 842 compression can be used for virtual memory compression, for databases — especially column-oriented stores, and when streaming input-output — for example to do backups or to write to
log file In computing, logging is the act of keeping a log of events that occur in a computer system, such as problems, errors or just information on current operations. These events may occur in the operating system or in other software. A message or ...
s.


Algorithm

The algorithm operates on blocks of 8 bytes with sub-phrases of 8, 4 and 2 bytes. A hash of each phrase is used to look up a hash table with offsets to a sliding window buffer of past encoded data. Matches can be replaced by the offset, so the result for each block can be some mixture of matched data and new literal data.


Implementations

IBM added hardware accelerators and instructions for 842 compression to their
Power Power most often refers to: * Power (physics), meaning "rate of doing work" ** Engine power, the power put out by an engine ** Electric power * Power (social and political), the ability to influence people or events ** Abusive power Power may ...
processors from POWER7+ onward. In addition,
POWER9 POWER9 is a family of superscalar, multithreading, multi-core microprocessors produced by IBM, based on the Power ISA. It was announced in August 2016. The POWER9-based processors are being manufactured using a 14 nm FinFET process ...
and
Power10 Power10 is a superscalar, multithreading, multi-core microprocessor family, based on the open source Power ISA, and announced in August 2020 at the Hot Chips conference; systems with Power10 CPUs. Generally available from September 2021 in t ...
added hardware acceleration for th
RFC 1951
Deflate algorithm, which is used by
zlib zlib ( or "zeta-lib", ) is a software library used for data compression. zlib was written by Jean-loup Gailly and Mark Adler and is an abstraction of the DEFLATE compression algorithm used in their gzip file compression program. zlib is also a ...
and
gzip gzip is a file format and a software application used for file compression and decompression. The program was created by Jean-loup Gailly and Mark Adler as a free software replacement for the compress program used in early Unix systems, and ...
. A
device driver In computing, a device driver is a computer program that operates or controls a particular type of device that is attached to a computer or automaton. A driver provides a software interface to hardware devices, enabling operating systems and o ...
for hardware-assisted 842 compression on a POWER processor was added to the
Linux kernel The Linux kernel is a free and open-source, monolithic, modular, multitasking, Unix-like operating system kernel. It was originally authored in 1991 by Linus Torvalds for his i386-based PC, and it was soon adopted as the kernel for the GNU ...
in 2011. More recently, Linux can fallback to a software implementation, which of course is much slower.
zram zram, formerly called compcache, is a Linux kernel module for creating a compressed block device in RAM, i.e. a RAM disk with on-the-fly disk compression. The block device created with zram can then be used for swap or as general-purpose RAM ...
, a
Linux Linux ( or ) is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically packaged as a Linux distribution, which i ...
kernel module for compressed
RAM drive Ram, ram, or RAM may refer to: Animals * A male sheep * Ram cichlid, a freshwater tropical fish People * Ram (given name) * Ram (surname) * Ram (director) (Ramsubramaniam), an Indian Tamil film director * RAM (musician) (born 1974), Dutch * ...
s, can be configured to use 842. Researchers have implemented 842 using
graphics processing unit A graphics processing unit (GPU) is a specialized electronic circuit designed to manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display device. GPUs are used in embedded systems, mo ...
s and found about 30x faster decompression using dedicated GPUs. An open source library provides 842 for
CUDA CUDA (or Compute Unified Device Architecture) is a parallel computing platform and application programming interface (API) that allows software to use certain types of graphics processing units (GPUs) for general purpose processing, an approach ...
and
OpenCL OpenCL (Open Computing Language) is a framework for writing programs that execute across heterogeneous platforms consisting of central processing units (CPUs), graphics processing units (GPUs), digital signal processors (DSPs), field-prog ...
. An
FPGA A field-programmable gate array (FPGA) is an integrated circuit designed to be configured by a customer or a designer after manufacturinghence the term ''Field-programmability, field-programmable''. The FPGA configuration is generally specifi ...
implementation of 842 demonstrated 13 times better throughput than a software implementation.


References

{{Compression Methods Lossless compression algorithms