zlib (or "zeta-lib") is a software library used for data compression as well as a data format.
zlib was written by Jean-loup Gailly and Mark Adler and is an abstraction of the DEFLATE compression algorithm used in their gzip file compression program. zlib is also a crucial component of many software platforms, including Linux, macOS, and iOS. It has also been used in gaming consoles such as the PlayStation 4, PlayStation 3, Wii U, Wii, Xbox One and Xbox 360.
The first public version of zlib, 0.9, was released on 1 May 1995 and was originally intended for use with the libpng image library. It is free software, distributed under the zlib License.
Capabilities
Encapsulation
Raw DEFLATE compressed data (RFC 1951) are typically written with a zlib or gzip wrapper encapsulating the data, by adding a header and footer. This provides stream identification and error detection that are not provided by the raw DEFLATE data.
The zlib wrapper (RFC 1950) is smaller than the gzip wrapper (RFC 1952), as the latter stores a file name and other file system information.
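The difference between the three encapsulations can be seen by compressing the same data three ways. The following is a minimal sketch that assumes Python's standard `zlib` module as the binding; the helper name `deflate` and the sample payload are illustrative only. The `wbits` argument selects the wrapper: a negative value gives raw DEFLATE, 15 gives the zlib wrapper, and 31 gives the gzip wrapper.

```python
import zlib


def deflate(data: bytes, wbits: int) -> bytes:
    """Compress data using the container format selected by wbits."""
    compressor = zlib.compressobj(6, zlib.DEFLATED, wbits)
    return compressor.compress(data) + compressor.flush()


payload = b"hello, hello, hello, zlib! " * 20

raw = deflate(payload, -15)          # raw DEFLATE (RFC 1951), no header or trailer
zlib_stream = deflate(payload, 15)   # zlib wrapper (RFC 1950)
gzip_stream = deflate(payload, 31)   # gzip wrapper (RFC 1952)

for name, blob in [("raw", raw), ("zlib", zlib_stream), ("gzip", gzip_stream)]:
    print(f"{name:5s} {len(blob)} bytes")
```

The decompressor must be given a matching `wbits` value, for example `zlib.decompress(raw, -15)` for the raw stream.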
Algorithm
zlib supports only one algorithm, called DEFLATE, which combines a variation of LZ77 (Lempel–Ziv 1977) with Huffman coding. This algorithm provides good compression on a wide variety of data with minimal use of system resources. This is also the algorithm used in the Zip archive format. The header makes allowance for other algorithms, but none are currently implemented.
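The allowance for other algorithms is visible in the first byte of a zlib stream. As a rough illustration, assuming Python's standard `zlib` module and the field layout given in RFC 1950, the low four bits of that byte hold the compression method (8 denotes DEFLATE) and the high four bits encode the LZ77 window size. The variable names are illustrative.

```python
import zlib

stream = zlib.compress(b"example data")

cmf, flg = stream[0], stream[1]      # the two header bytes defined by RFC 1950
method = cmf & 0x0F                  # CM field: 8 means DEFLATE
window_log2 = (cmf >> 4) + 8         # CINFO field: log2 of the window size

# RFC 1950 requires the header, read as a big-endian 16-bit value,
# to be a multiple of 31; this serves as a simple sanity check.
header_ok = (cmf * 256 + flg) % 31 == 0

print(method, window_log2, header_ok)
```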
Resource use
zlib provides facilities for control of processor and memory use. A compression level value may be supplied that trades speed for compression. There are also facilities for conserving memory, useful in restricted memory environments, such as some embedded systems.
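A short sketch of these controls, assuming Python's standard `zlib` module: the second argument of `zlib.compress` is the compression level, and `zlib.compressobj` additionally accepts a window size (`wbits`) and a memory level (`memLevel`). The sample data and variable names are illustrative.

```python
import zlib

data = b"The quick brown fox jumps over the lazy dog. " * 200

fast = zlib.compress(data, 1)     # level 1: favour speed
small = zlib.compress(data, 9)    # level 9: favour compression ratio
stored = zlib.compress(data, 0)   # level 0: data is stored without compression

# A 512-byte window (wbits=9) and memLevel=1 reduce the compressor's
# memory use, usually at some cost in compression ratio.
frugal = zlib.compressobj(6, zlib.DEFLATED, 9, 1)
frugal_out = frugal.compress(data) + frugal.flush()

for name, blob in [("fast", fast), ("small", small),
                   ("stored", stored), ("frugal", frugal_out)]:
    print(f"{name:7s} {len(blob)} bytes")
```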
Strategy
The compression can be optimized for specific types of data. If the library is always used to compress one specific type of data, then choosing a matching strategy may improve compression and performance. For example, if the data contain long runs of repeated bytes, the run-length encoding (RLE) strategy may give good results at a higher speed. For general data, the default strategy is preferred.
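Selecting a strategy might look as follows. This is a sketch assuming Python's standard `zlib` module, where the strategy is the fifth argument of `zlib.compressobj`; the helper `compress_with` and the test data are illustrative. Whether RLE actually wins depends on the input, so both sizes are printed rather than assumed.

```python
import zlib


def compress_with(data: bytes, strategy: int) -> bytes:
    """Compress data at level 6 with the given zlib strategy constant."""
    compressor = zlib.compressobj(
        6, zlib.DEFLATED, zlib.MAX_WBITS, zlib.DEF_MEM_LEVEL, strategy
    )
    return compressor.compress(data) + compressor.flush()


# Input made of long runs of identical bytes.
runs = b"".join(bytes([value]) * 500 for value in range(64))

default_out = compress_with(runs, zlib.Z_DEFAULT_STRATEGY)
rle_out = compress_with(runs, zlib.Z_RLE)

print("default:", len(default_out), "bytes")
print("rle:    ", len(rle_out), "bytes")
```

The strategy only affects the compressor; any standard decompressor can read the result.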
Error handling
Errors in compressed data may be detected and skipped. Further, if "full-flush" points are written to the compressed stream, then corrupt data can be skipped, and the decompression will resynchronize at the next flush point, although no error recovery of the corrupt data is provided. Full-flush points are useful for large data streams on unreliable channels, where some data loss is unimportant, such as in some multimedia applications. However, creating many flush points can affect the speed as well as the amount (ratio) of compression.
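The effect of a full-flush point can be illustrated with a small sketch, assuming Python's standard `zlib` module and a raw DEFLATE stream (`wbits=-15`). A full flush ends on a byte boundary and resets the compressor's history, so the data written after it can be decoded without anything that came before. Instead of scanning a damaged stream for the flush marker, this sketch simply assumes the first segment was lost in transit; the segment contents are illustrative.

```python
import zlib

first = b"first segment " * 50
second = b"second segment " * 50

compressor = zlib.compressobj(6, zlib.DEFLATED, -15)  # raw DEFLATE
part1 = compressor.compress(first) + compressor.flush(zlib.Z_FULL_FLUSH)
part2 = compressor.compress(second) + compressor.flush(zlib.Z_FINISH)

# Pretend part1 was corrupted or dropped: decoding restarts at the flush point.
recovered = zlib.decompressobj(wbits=-15).decompress(part2)

print(recovered == second)
```

The lost segment itself is not recovered; only the data after the flush point is.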
Data length
There is no limit to the length of data that can be compressed or decompressed. Repeated calls to the library allow an unlimited number of blocks of data to be handled. Some ancillary code (counters) may suffer from overflow for long data streams, but this does not affect the actual compression or decompression.
When compressing a long (or infinite) data stream, it is advisable to write regular full-flush points.
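Handling data of arbitrary length through repeated calls can be sketched as a pair of generators. This assumes Python's standard `zlib` module; the function names and the `flush_every` parameter are hypothetical choices for illustration, not part of the library.

```python
import zlib
from typing import Iterable, Iterator


def compress_stream(chunks: Iterable[bytes], flush_every: int = 4) -> Iterator[bytes]:
    """Compress chunks incrementally, writing a full-flush point periodically."""
    compressor = zlib.compressobj()
    for index, chunk in enumerate(chunks, start=1):
        out = compressor.compress(chunk)
        if index % flush_every == 0:
            out += compressor.flush(zlib.Z_FULL_FLUSH)
        if out:
            yield out
    yield compressor.flush()  # finish the stream


def decompress_stream(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Decompress a zlib stream delivered in pieces of any size."""
    decompressor = zlib.decompressobj()
    for chunk in chunks:
        out = decompressor.decompress(chunk)
        if out:
            yield out
    tail = decompressor.flush()
    if tail:
        yield tail


blocks = [bytes([i % 251]) * 1000 for i in range(10)]
restored = b"".join(decompress_stream(compress_stream(blocks)))
print(restored == b"".join(blocks))
```

Because only one chunk is held at a time, memory use stays bounded regardless of the total stream length.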
Applications
Today, zlib is something of a de facto standard, to the point that zlib and DEFLATE are often used interchangeably in standards documents, with thousands of applications relying on it for compression, either directly or indirectly. These include:
* The Linux kernel, where zlib is used to implement compressed network protocols, compressed file systems, and to decompress the kernel image at boot time.
* GNU Binutils and GNU Debugger (GDB)
* libpng, the reference implementation for the PNG image format, which specifies DEFLATE as the stream compression for its bitmap data.
* libwww, an API for web applications like web browsers.
* The Apache HTTP Server, which uses zlib to implement HTTP/1.1 compression.
* Similarly, the cURL library uses zlib to decompress HTTP responses.
* The OpenSSH client and server, which rely on zlib to perform the optional compression offered by the Secure Shell protocol.
* The OpenSSL and GnuTLS security libraries, which can optionally use zlib to compress TLS connections.
* The FFmpeg multimedia library, which uses zlib to read and write the DEFLATE-compressed parts of stream formats, such as Matroska.
* The rsync remote file synchronizer, which uses zlib to implement optional protocol compression.
* The dpkg and RPM package managers, which use zlib to unpack files from compressed software packages.
* The Apache Subversion and CVS version control systems, which use zlib to compress traffic to and from remote repositories.
* The Apache ORC column-oriented data storage format, which uses zlib as its default compression method.
* The Git version control system, which uses zlib to store the contents of its data objects (blobs, trees, commits and tags).
* The PostgreSQL RDBMS, which uses zlib with the custom dump format (pg_dump -Fc) for database backups.
* The class System.IO.Compression.DeflateStream of the Microsoft .NET Framework 2.0 and higher.
* The "deflate" utility in Tornado, part of the VxWorks operating system made by Wind River Systems, which uses zlib to compress boot ROM images.
* zlib-flate, a raw zlib compression program that is part of qpdf.
* The MySQL RDBMS, which uses zlib (LZ77) for compression in InnoDB tables.
zlib is also used in many embedded devices, such as the Apple iPhone and Sony PlayStation 3, because the code is portable, liberally licensed, and has a relatively small memory footprint.
Forks
As a commonly used library built on an old codebase, zlib is also frequently forked by third parties that claim improvements to the library:
* Intel has a high-performance fork of zlib.
* Cloudflare maintains a high-performance fork with "massive" improvements.
* zlib-ng is a zlib replacement fork for "next generation" systems. It removes workaround code for compilers that do not support ANSI C, integrates Cloudflare and Intel optimizations, adds hardware acceleration (SIMD and intrinsic functions), and uses code sanitizers, fuzzing, and code coverage to help find bugs.
See also
* DEFLATE
* gzip
* LZ77 and LZ78 § LZ77
* Zip (file format)
* zlib License
* Zopfli
* List of archive formats
External links
* Official website: zlib.net