zlib (or "zeta-lib") is a software library used for data compression, as well as a data format. zlib was written by Jean-loup Gailly and Mark Adler and is an abstraction of the DEFLATE compression algorithm used in their gzip file compression program. zlib is also a crucial component of many software platforms, including Linux, macOS, and iOS. It has also been used in gaming consoles such as the PlayStation 4, PlayStation 3, Wii U, Wii, Xbox One, and Xbox 360.
The first public version of zlib, 0.9, was released on 1 May 1995 and was originally intended for use with the libpng image library. It is free software, distributed under the zlib License.
Capabilities
Encapsulation
Raw DEFLATE compressed data (RFC 1951) are typically written with a zlib or gzip wrapper encapsulating the data, by adding a header and footer. This provides stream identification and error detection that are not provided by the raw DEFLATE data.
The zlib wrapper (RFC 1950) is smaller than the gzip wrapper (RFC 1952), as the latter stores a file name and other file system information.
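The three encapsulations can be compared in a minimal sketch using Python's standard zlib module, which binds the zlib library. The wbits values select the container as that module documents them (negative for raw DEFLATE, 15 for the zlib wrapper, 31 for the gzip wrapper). The helper name wrap and the sample payload are illustrative, and the gzip stream produced this way carries only a minimal header with no file name.

```python
import zlib

def wrap(data: bytes, wbits: int) -> bytes:
    """Compress data once, letting wbits choose the container format."""
    comp = zlib.compressobj(level=6, wbits=wbits)
    return comp.compress(data) + comp.flush()

payload = b"hello, zlib! " * 20  # illustrative sample data

raw = wrap(payload, -15)  # raw DEFLATE (RFC 1951): no header, no checksum
zl = wrap(payload, 15)    # zlib wrapper (RFC 1950): 2-byte header + Adler-32
gz = wrap(payload, 31)    # gzip wrapper (RFC 1952): header + CRC-32 + length

# The zlib trailer is the Adler-32 of the uncompressed data, big-endian.
adler_ok = int.from_bytes(zl[-4:], "big") == zlib.adler32(payload)
# The gzip trailer holds the CRC-32, then the length, both little-endian.
crc_ok = int.from_bytes(gz[-8:-4], "little") == zlib.crc32(payload)

# wbits=47 (32 + 15) asks the decoder to auto-detect a zlib or gzip header.
restored = [zlib.decompress(blob, 47) for blob in (zl, gz)]

print(len(raw), len(zl), len(gz), adler_ok, crc_ok)
```

The checksums in the two trailers are what provide the error detection that the raw stream lacks; the raw stream can only be decoded by a caller that already knows it is raw DEFLATE.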
Algorithm
zlib supports only one algorithm, called DEFLATE, which combines a variation of LZ77 (Lempel–Ziv 1977) with Huffman coding. This algorithm provides good compression on a wide variety of data with minimal use of system resources. This is also the algorithm used in the Zip archive format. The header makes allowance for other algorithms, but none are currently implemented.
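The header's allowance for other algorithms can be seen in the first two bytes of a zlib-wrapped stream, whose layout is defined in RFC 1950. The following sketch, written against Python's standard zlib module, reads the compression-method field; the variable names are illustrative.

```python
import zlib

stream = zlib.compress(b"example data")  # default zlib-wrapped stream

cmf, flg = stream[0], stream[1]
method = cmf & 0x0F           # CM field: compression method, 8 means DEFLATE
window_bits = (cmf >> 4) + 8  # CINFO field: log2 of the LZ77 window size, minus 8
header_ok = (cmf * 256 + flg) % 31 == 0  # FCHECK makes the pair a multiple of 31

print(method, window_bits, header_ok)  # prints: 8 15 True
```

The method field is four bits wide, so other values could in principle identify other algorithms, but 8 is the only one this module exposes (as zlib.DEFLATED).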
Resource use
zlib provides facilities for control of processor and memory use. A compression level value may be supplied that trades speed for compression. There are also facilities for conserving memory, useful in restricted memory environments, such as some embedded systems.
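A rough sketch of these controls, using Python's standard zlib module: the level argument sets the speed/compression trade-off, while wbits and memLevel bound the compressor's memory. The input data and variable names are illustrative, and the sizes printed will vary with the zlib version in use.

```python
import zlib

data = b"sensor=42;status=ok;" * 500  # illustrative, highly repetitive input

# Level 0 stores the data uncompressed; level 9 spends the most effort.
stored = zlib.compress(data, 0)
fast = zlib.compress(data, 1)
best = zlib.compress(data, 9)

# A smaller window (wbits) and memLevel reduce the compressor's memory use,
# usually at some cost in compression ratio.
small = zlib.compressobj(level=6, wbits=9, memLevel=1)
frugal = small.compress(data) + small.flush()

results = [("stored", stored), ("fast", fast), ("best", best), ("frugal", frugal)]
for name, blob in results:
    print(name, len(blob), zlib.decompress(blob) == data)
```

All four outputs are ordinary zlib streams, so the decompressing side needs no knowledge of which settings were chosen.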
Strategy
The compression can be optimized for specific types of data. If the library is always used to compress one specific type of data, then choosing a matching strategy may improve compression and performance. For example, if the data contain long runs of repeated bytes, the run-length encoding (RLE) strategy may give good results at a higher speed. For general data, the default strategy is preferred.
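Selecting a strategy might look like the following sketch in Python's standard zlib module. It assumes the module exposes the Z_RLE constant, which depends on the Python version and the underlying zlib build; the helper name deflate and the test data are illustrative.

```python
import zlib

def deflate(data: bytes, strategy: int) -> bytes:
    """Compress data with the given zlib strategy constant."""
    comp = zlib.compressobj(level=6, strategy=strategy)
    return comp.compress(data) + comp.flush()

# Illustrative input with long runs of identical bytes.
runs = b"".join(bytes([value]) * 4000 for value in range(16))

default_out = deflate(runs, zlib.Z_DEFAULT_STRATEGY)
rle_out = deflate(runs, zlib.Z_RLE)

# Every strategy produces a standard stream; only the encoder's choices differ.
print(len(runs), len(default_out), len(rle_out))
```

Because the strategy affects only how the encoder searches for matches, both outputs decompress with the same call; which one is smaller or faster depends on the data and is best measured rather than assumed.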
Error handling
Errors in compressed data may be detected and skipped. Further, if "full-flush" points are written to the compressed stream, then corrupt data can be skipped, and decompression will resynchronize at the next flush point, although no error recovery of the corrupt data is provided. Full-flush points are useful for large data streams on unreliable channels, where some data loss is unimportant, such as in some multimedia applications. However, creating many flush points can reduce both the speed and the compression ratio.
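The effect of a full-flush point can be sketched with Python's standard zlib module. The C library provides a routine (inflateSync) that searches a damaged stream for the next such point; the Python module does not expose it, so this sketch simply records the offset when writing. Raw DEFLATE is used to keep the example free of header and checksum handling, and all names are illustrative.

```python
import zlib

blocks = [b"first block " * 50, b"second block " * 50, b"third block " * 50]

comp = zlib.compressobj(level=6, wbits=-15)  # raw DEFLATE keeps the sketch simple
segments = []
for block in blocks:
    # Z_FULL_FLUSH byte-aligns the output and resets the compressor's history,
    # so data after this point never refers back to data before it.
    segments.append(comp.compress(block) + comp.flush(zlib.Z_FULL_FLUSH))
segments.append(comp.flush())  # Z_FINISH: terminates the stream

# Simulate damage by overwriting the whole first segment.
damaged = b"\x00" * len(segments[0]) + b"".join(segments[1:])

# Restart decoding at the first flush point (offset known here by construction).
resume_at = len(segments[0])
recovered = zlib.decompressobj(-15).decompress(damaged[resume_at:])

print(recovered == blocks[1] + blocks[2])
```

The first block is lost, as the text says, but everything after the flush point decodes normally because the decoder needs no history from before it.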
Data length
There is no limit to the length of data that can be compressed or decompressed. Repeated calls to the library allow an unlimited number of blocks of data to be handled. Some ancillary code (counters) may suffer from overflow for long data streams, but this does not affect the actual compression or decompression.
When compressing a long (or infinite) data stream, it is advisable to write regular full-flush points.
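Repeated calls on a stream of unknown length might be structured as in this sketch, which uses the streaming objects of Python's standard zlib module. The function names, the flush_every parameter, and the sample data are hypothetical choices for illustration.

```python
import zlib
from typing import Iterable, Iterator

def compress_stream(chunks: Iterable[bytes], flush_every: int = 4) -> Iterator[bytes]:
    """Compress an arbitrarily long sequence of chunks with bounded memory."""
    comp = zlib.compressobj()
    for index, chunk in enumerate(chunks, start=1):
        out = comp.compress(chunk)
        if index % flush_every == 0:
            out += comp.flush(zlib.Z_FULL_FLUSH)  # periodic full-flush point
        if out:
            yield out
    yield comp.flush()  # finish the stream

def decompress_stream(pieces: Iterable[bytes]) -> Iterator[bytes]:
    """Decompress pieces as they arrive, without holding the whole stream."""
    decomp = zlib.decompressobj()
    for piece in pieces:
        out = decomp.decompress(piece)
        if out:
            yield out
    tail = decomp.flush()
    if tail:
        yield tail

source = (f"record {n}\n".encode() * 100 for n in range(20))  # illustrative data
restored = b"".join(decompress_stream(compress_stream(source)))
print(len(restored))
```

Neither side ever holds more than one chunk plus the library's internal state, which is what allows the total length to be unbounded.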
Applications
Today, zlib is something of a de facto standard, to the point that zlib and DEFLATE are often used interchangeably in standards documents, with thousands of applications relying on it for compression, either directly or indirectly. These include:
* The Linux kernel, where zlib is used to implement compressed network protocols, compressed file systems, and to decompress the kernel image at boot time.
* GNU Binutils and GNU Debugger (GDB)
* libpng, the reference implementation for the PNG image format, which specifies DEFLATE as the stream compression for its bitmap data.
* libwww, an API for web applications like web browsers.
* The Apache HTTP Server, which uses zlib to implement HTTP/1.1 content compression.
* Similarly, the cURL library uses zlib to decompress HTTP responses.
* The OpenSSH client and server, which rely on zlib to perform the optional compression offered by the Secure Shell protocol.
* The OpenSSL and GnuTLS security libraries, which can optionally use zlib to compress TLS connections.
* The FFmpeg multimedia library, which uses zlib to read and write the DEFLATE-compressed parts of stream formats, such as Matroska.
* The rsync remote file synchronizer, which uses zlib to implement optional protocol compression.
* The dpkg and RPM package managers, which use zlib to unpack files from compressed software packages.
* The Apache Subversion and CVS version control systems, which use zlib to compress traffic to and from remote repositories.
* The Apache ORC column-oriented data storage format, which uses zlib as its default compression method.
* The Git version control system, which uses zlib to store the contents of its data objects (blobs, trees, commits and tags).
* The PostgreSQL RDBMS, which uses zlib with its custom dump format (pg_dump -Fc) for database backups.
* The class System.IO.Compression.DeflateStream of the Microsoft .NET Framework 2.0 and higher.
* The "deflate" utility in Tornado, part of the VxWorks operating system made by Wind River Systems, which uses zlib to compress boot ROM images.
* zlib-flate, a raw zlib compression program that is part of qpdf.
* The MySQL RDBMS, which uses zlib's LZ77-based compression for InnoDB tables.
zlib is also used in many embedded devices, such as the Apple iPhone and Sony PlayStation 3, because the code is portable, liberally licensed, and has a relatively small memory footprint.
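As one concrete illustration of the Git entry in the list above, the following sketch builds a blob the way Git's loose-object storage is generally described: a short header, a NUL byte, and the content, hashed and then compressed with zlib. It assumes the SHA-1 object format and uses Python's standard library; the function name git_blob is hypothetical, and real repositories also use packfiles, which this does not cover.

```python
import hashlib
import zlib

def git_blob(content: bytes):
    """Return (object id, stored bytes) for a blob held as a loose object."""
    store = b"blob " + str(len(content)).encode() + b"\x00" + content
    object_id = hashlib.sha1(store).hexdigest()  # SHA-1 object format assumed
    return object_id, zlib.compress(store)

object_id, stored = git_blob(b"hello world\n")

# Reading the object back is a plain zlib decompression followed by a split.
header, _, body = zlib.decompress(stored).partition(b"\x00")
print(object_id, header, body)
```

The object id printed for this input should match what `git hash-object` reports for a file containing the same bytes.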
Forks
A commonly used library built on an old codebase, zlib is also frequently forked by third parties that claim improvements to this library:
* Intel has a high-performance fork of zlib.
* Cloudflare maintains a high-performance fork with "massive" improvements.
* zlib-ng is a zlib replacement fork for "next generation" systems. It removes workaround code for compilers that do not support ANSI C, integrates Cloudflare and Intel optimizations, adds hardware acceleration (SIMD and intrinsic functions), and uses code sanitizers, fuzzing, and code coverage to help find bugs.
See also
* DEFLATE
* gzip
* LZ77 and LZ78 § LZ77
* Zip (file format)
* zlib License
* Zopfli
* List of archive formats
External links
* Official website: zlib.net