Zstd
Zstandard is a lossless data compression algorithm developed by Yann Collet at Facebook. Zstd is the corresponding reference implementation in C, released as open-source software on 31 August 2016. The algorithm was published in 2018 as RFC 8478, which also defines an associated media type "application/zstd", filename extension "zst", and HTTP content encoding "zstd".

Features

Zstandard was designed to give a compression ratio comparable to that of the DEFLATE algorithm (developed in 1991 and used in the original ZIP and gzip programs), but faster, especially for decompression. It is tunable with compression levels ranging from negative 7 (fastest) to 22 (slowest in compression speed, but best compression ratio). Starting from version 1.3.2 (October 2017), zstd optionally implements very-long-range search and deduplication (--long mode, 128 MiB window) similar to rzip or lrzip. Compression speed can vary by a factor of 20 or more between the fastest and slowest levels, while decompression ...
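The one-shot libzstd API is small enough to show directly. A minimal round-trip sketch in C (the buffer sizes, sample input, and level 19 are illustrative choices, not prescribed by the library):

```c
#include <stdio.h>
#include <string.h>
#include <zstd.h>   /* libzstd; link with -lzstd */

int main(void) {
    const char src[] = "zstandard zstandard zstandard zstandard";
    size_t src_len = sizeof src;

    char dst[512];   /* comfortably above ZSTD_compressBound(src_len) here */
    size_t c_len = ZSTD_compress(dst, sizeof dst, src, src_len, 19 /* level */);
    if (ZSTD_isError(c_len)) {
        fprintf(stderr, "compress: %s\n", ZSTD_getErrorName(c_len));
        return 1;
    }
    printf("%zu -> %zu bytes\n", src_len, c_len);

    char back[512];
    size_t d_len = ZSTD_decompress(back, sizeof back, dst, c_len);
    if (ZSTD_isError(d_len)) {
        fprintf(stderr, "decompress: %s\n", ZSTD_getErrorName(d_len));
        return 1;
    }
    printf("round trip %s\n",
           d_len == src_len && memcmp(src, back, src_len) == 0 ? "ok" : "FAILED");
    return 0;
}
```

In production code the destination capacity would come from ZSTD_compressBound() rather than a fixed array, and the streaming API would be used for inputs that do not fit in memory.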
Asymmetric Numeral Systems
Asymmetric numeral systems (ANS) is a family of entropy encoding methods introduced by Jarosław (Jarek) Duda of Jagiellonian University, used in data compression since 2014 due to improved performance compared to previous methods (J. Duda, K. Tahboub, N. J. Gadgil, E. J. Delp, "The use of asymmetric numeral systems as an accurate replacement for Huffman coding", Picture Coding Symposium, 2015; J. Duda, "Asymmetric numeral systems: entropy coding combining speed of Huffman coding with compression rate of arithmetic coding", arXiv:1311.2540, 2013). ANS combines the compression ratio of arithmetic coding (which uses a nearly accurate probability distribution) with a processing cost similar to that of Huffman coding. In the tabled ANS (tANS) variant, this is achieved by constructing a finite-state machine that operates on a large alphabet without using multiplication. Among others, ANS is used in the Facebook Zstandard compressor ...
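The core state update is easiest to see in the range variant (rANS). The toy below is a sketch only: a hypothetical three-symbol alphabet, hand-picked frequencies, and no renormalization, so it is far from the tANS tables Zstandard actually uses:

```c
#include <stdio.h>
#include <stdint.h>

#define M 16  /* total of all symbol frequencies; a power of two */

/* alphabet {0,1,2} with frequencies 8, 4, 4 (probabilities 1/2, 1/4, 1/4) */
static const uint64_t freq[3] = {8, 4, 4};
static const uint64_t cum[3]  = {0, 8, 12};

/* push one symbol onto the state */
static uint64_t encode(uint64_t x, int s) {
    return (x / freq[s]) * M + (x % freq[s]) + cum[s];
}

/* pop the most recently pushed symbol off the state */
static uint64_t decode(uint64_t x, int *s) {
    uint64_t slot = x % M;
    *s = slot < 8 ? 0 : (slot < 12 ? 1 : 2);
    return freq[*s] * (x / M) + slot - cum[*s];
}

int main(void) {
    const int msg[] = {0, 1, 2, 0, 0, 1};
    const int n = sizeof msg / sizeof msg[0];

    uint64_t x = M;  /* any starting state >= M works for this demo */
    for (int i = 0; i < n; i++) x = encode(x, msg[i]);
    printf("encoded state: %llu\n", (unsigned long long)x);

    /* ANS behaves like a stack: symbols come back last-in, first-out */
    for (int i = n - 1; i >= 0; i--) {
        int s;
        x = decode(x, &s);
        printf("%d ", s);   /* prints msg[5], msg[4], ..., msg[0] */
    }
    printf("\nrestored state: %llu\n", (unsigned long long)x);  /* M again */
    return 0;
}
```

Encoding symbol s multiplies the state by roughly M/freq[s] = 1/p(s), so the state grows by about log2(1/p(s)) bits per symbol, which is exactly the entropy cost; a real coder renormalizes by streaming out low bits to keep the state bounded.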
Media Type
In information and communications technology, a media type, content type or MIME type is a two-part identifier for file formats and content formats. Their purpose is comparable to that of filename extensions and uniform type identifiers, in that they identify the intended data format. They are mainly used by technologies underpinning the Internet, and are also used on Linux desktop systems. The Internet Assigned Numbers Authority (IANA) is the official authority for the standardization and publication of these classifications. Media types were originally defined in RFC 2045, ''MIME (Multipurpose Internet Mail Extensions) Part One: Format of Internet Message Bodies'', in November 1996 as a part of the MIME specification, for denoting the type of email message content and attachments; hence the original name, ''MIME type''. Media types are also used by other internet protocols such as HTTP, document file formats such as HTML, and the XDG specifications implemented by Linux ...
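The "two-part" structure is literally a type and a subtype separated by a slash. A trivial C sketch of splitting one (the sample string is the media type registered for Zstandard; the splitting code itself is just an illustration):

```c
#include <stdio.h>
#include <string.h>

int main(void) {
    const char *mime = "application/zstd";   /* media type from RFC 8478 */
    const char *slash = strchr(mime, '/');
    if (slash) {
        printf("type:    %.*s\n", (int)(slash - mime), mime);  /* "application" */
        printf("subtype: %s\n", slash + 1);                    /* "zstd" */
    }
    return 0;
}
```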
HTTP Compression
HTTP compression is a capability that can be built into web servers and web clients to improve transfer speed and bandwidth utilization. HTTP data is compressed before it is sent from the server: compliant browsers announce which methods they support to the server before downloading the correct format; browsers that do not support a compliant compression method download the data uncompressed. The most common compression schemes include gzip and Brotli; a full list of available schemes is maintained by the IANA.

There are two different ways compression can be done in HTTP. At a lower level, a Transfer-Encoding header field may indicate that the payload of an HTTP message is compressed. At a higher level, a Content-Encoding header field may indicate that a resource being transferred, cached, or otherwise referenced is compressed. Compression using Content-Encoding is more widely supported than Transfer-Encoding, and some browsers do not advertise support for Transfer-Encoding compression ...
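On the client side, the negotiation amounts to sending an Accept-Encoding header and decoding whatever Content-Encoding comes back. A minimal sketch using libcurl, which handles both ends of that exchange (the URL is a placeholder; error handling is reduced to the essentials):

```c
#include <stdio.h>
#include <curl/curl.h>

int main(void) {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL *curl = curl_easy_init();
    if (curl) {
        curl_easy_setopt(curl, CURLOPT_URL, "https://example.com/");
        /* Empty string: advertise every encoding libcurl was built with
           (e.g. "Accept-Encoding: gzip, deflate, br") and decompress the
           response body automatically. */
        curl_easy_setopt(curl, CURLOPT_ACCEPT_ENCODING, "");
        CURLcode res = curl_easy_perform(curl);
        if (res != CURLE_OK)
            fprintf(stderr, "request failed: %s\n", curl_easy_strerror(res));
        curl_easy_cleanup(curl);
    }
    curl_global_cleanup();
    return 0;
}
```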
Rzip
rzip is a huge-scale data compression program designed around initial LZ77-style string matching over a 900 MB dictionary window, followed by bzip2-based Burrows–Wheeler transform and entropy coding (Huffman) on 900 kB output chunks.

Compression algorithm

rzip operates in two stages. The first stage finds and encodes large chunks of duplicated data over potentially very long distances (900 MB) in the input file. The second stage uses a standard compression algorithm (bzip2) to compress the output of the first stage. It is quite common these days to need to compress files that contain long-distance redundancies. For example, when compressing a set of home directories, several users might have copies of the same file, or of quite similar files. It is also common to have a single file that contains large duplicated chunks over long distances, such as PDF files containing repeated copies of the same image. Most compression programs won't be able to take advantage ...
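The idea behind the first stage can be caricatured in a few lines: remember a hash of each chunk seen so far and report chunks that reappear far away. This sketch is not rzip's actual matcher (which uses a rolling hash over a 900 MB window); the block size, hash function, and planted input are all arbitrary demo choices:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>

#define BLOCK 16          /* demo block size; rzip matches far longer spans */
#define TABLE (1u << 12)  /* tiny hash table, indexed by block hash */

/* FNV-1a hash of one block */
static uint32_t fnv1a(const unsigned char *p, size_t n) {
    uint32_t h = 2166136261u;
    while (n--) { h ^= *p++; h *= 16777619u; }
    return h;
}

int main(void) {
    unsigned char data[4096];
    srand(42);
    for (size_t i = 0; i < sizeof data; i++)
        data[i] = (unsigned char)rand();
    /* plant the same 16-byte block at offsets 0 and 4000 (both BLOCK-aligned) */
    memcpy(data,        "the quick brown ", BLOCK);
    memcpy(data + 4000, "the quick brown ", BLOCK);

    long seen[TABLE];   /* offset of the first block stored in each slot */
    for (size_t i = 0; i < TABLE; i++) seen[i] = -1;

    for (size_t off = 0; off + BLOCK <= sizeof data; off += BLOCK) {
        uint32_t slot = fnv1a(data + off, BLOCK) % TABLE;
        if (seen[slot] >= 0 && memcmp(data + seen[slot], data + off, BLOCK) == 0)
            /* a real first stage would emit a (distance, length) reference */
            printf("block at %zu repeats block at %ld\n", off, seen[slot]);
        else
            seen[slot] = (long)off;
    }
    return 0;
}
```

rzip's point is that such references can span distances far beyond bzip2's 900 kB blocks; stage two then compresses the match-reduced stream conventionally.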
Huffman Coding
In computer science and information theory, a Huffman code is a particular type of optimal prefix code that is commonly used for lossless data compression. The process of finding or using such a code is Huffman coding, an algorithm developed by David A. Huffman while he was an Sc.D. student at the Massachusetts Institute of Technology (MIT), and published in the 1952 paper "A Method for the Construction of Minimum-Redundancy Codes". The output from Huffman's algorithm can be viewed as a variable-length code table for encoding a source symbol (such as a character in a file). The algorithm derives this table from the estimated probability or frequency of occurrence (''weight'') for each possible value of the source symbol. As in other entropy encoding methods, more common symbols are generally represented using fewer bits than less common symbols. Huffman's method can be implemented efficiently, finding a code in time linear in the number of input weights ...
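The algorithm itself is short: repeatedly merge the two lowest-weight nodes until one tree remains, then read each symbol's code off the root-to-leaf path. A self-contained sketch with a made-up four-symbol alphabet and weights (a linear scan stands in for the priority queue a real implementation would use):

```c
#include <stdio.h>

#define NSYM 4                  /* demo alphabet size */
#define NNODE (2 * NSYM - 1)    /* total nodes in a Huffman tree over NSYM leaves */

typedef struct { int weight, left, right; } Node;  /* left/right = -1 at leaves */

/* walk the tree, printing the 0/1 path to each leaf */
static void print_codes(const Node *node, const char *sym, int i,
                        char *buf, int depth) {
    if (node[i].left < 0) {              /* leaf: path so far is the code */
        buf[depth] = '\0';
        printf("%c -> %s\n", sym[i], buf);
        return;
    }
    buf[depth] = '0';
    print_codes(node, sym, node[i].left, buf, depth + 1);
    buf[depth] = '1';
    print_codes(node, sym, node[i].right, buf, depth + 1);
}

int main(void) {
    const char sym[NSYM] = {'a', 'b', 'c', 'd'};
    const int weights[NSYM] = {45, 25, 20, 10};   /* made-up frequencies */
    Node node[NNODE];
    int used[NNODE] = {0};
    int count = NSYM;

    for (int i = 0; i < NSYM; i++)
        node[i] = (Node){weights[i], -1, -1};

    /* repeatedly merge the two lowest-weight unused nodes */
    while (count < NNODE) {
        int lo1 = -1, lo2 = -1;
        for (int i = 0; i < count; i++) {
            if (used[i]) continue;
            if (lo1 < 0 || node[i].weight < node[lo1].weight) { lo2 = lo1; lo1 = i; }
            else if (lo2 < 0 || node[i].weight < node[lo2].weight) { lo2 = i; }
        }
        used[lo1] = used[lo2] = 1;
        node[count++] = (Node){node[lo1].weight + node[lo2].weight, lo1, lo2};
    }

    char buf[NNODE];
    print_codes(node, sym, NNODE - 1, buf, 0);    /* last node created is the root */
    return 0;
}
```

For these weights the printed codes are a -> 0, b -> 10, d -> 110, c -> 111: the most frequent symbol gets the shortest code.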
Entropy Encoding
In information theory, an entropy coding (or entropy encoding) is any lossless data compression method that attempts to approach the lower bound declared by Shannon's source coding theorem, which states that any lossless data compression method must have an expected code length greater than or equal to the entropy of the source. More precisely, the source coding theorem states that for any source distribution, the expected code length satisfies \operatorname{E}_{x \sim P}[\ell(d(x))] \geq \operatorname{E}_{x \sim P}[-\log_b(P(x))], where \ell is the function specifying the number of symbols in a code word, d is the coding function, b is the number of symbols used to make output codes and P is the probability of the source symbol. An entropy coding attempts to approach this lower bound. Two of the most common entropy coding techniques are Huffman coding and arithmetic coding. If the approximate entropy characteristics of a data stream are known in advance (especially for signal compression), a simple ...
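As a worked instance of the bound (an illustrative example, not from the article): for a dyadic three-symbol source, an optimal prefix code meets the entropy lower bound exactly.

```latex
% Source: symbols a, b, c with P(a) = 1/2, P(b) = P(c) = 1/4; binary output (b = 2).
\[
  \operatorname{E}_{x \sim P}[-\log_2 P(x)]
    = \tfrac12 \log_2 2 + \tfrac14 \log_2 4 + \tfrac14 \log_2 4
    = 1.5 \ \text{bits/symbol}.
\]
% The Huffman code a -> 0, b -> 10, c -> 11 achieves
\[
  \operatorname{E}_{x \sim P}[\ell(d(x))]
    = \tfrac12 \cdot 1 + \tfrac14 \cdot 2 + \tfrac14 \cdot 2
    = 1.5 \ \text{bits/symbol},
\]
% so the bound holds with equality; this happens exactly when every
% probability is an integer power of 1/b.
```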
LZ77
LZ77 and LZ78 are the two lossless data compression algorithms published in papers by Abraham Lempel and Jacob Ziv in 1977 and 1978. They are also known as Lempel–Ziv 1 (LZ1) and Lempel–Ziv 2 (LZ2) respectively. These two algorithms form the basis for many variations, including Lempel–Ziv–Welch (LZW), Lempel–Ziv–Storer–Szymanski (LZSS), the Lempel–Ziv–Markov chain algorithm (LZMA) and others. Besides their academic influence, these algorithms formed the basis of several ubiquitous compression schemes, including GIF and the DEFLATE algorithm used in Portable Network Graphics (PNG) and ZIP. They are both theoretically dictionary coders. LZ77 maintains a sliding window during compression. This was later shown to be equivalent to the ''explicit dictionary'' constructed by LZ78; however, they are only equivalent when the entire data is intended to be decompressed. Since LZ77 encodes and decodes from a sliding window over previously seen characters, decompression ...
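A naive sketch of LZ77's sliding-window matching, emitting classic (offset, length, next character) triples (the window size, input string, and brute-force search are demo choices; real encoders use hash chains or trees):

```c
#include <stdio.h>
#include <string.h>

#define WINDOW 32   /* illustrative sliding-window size */

int main(void) {
    const char *s = "abcabcabcx";
    size_t n = strlen(s), pos = 0;

    while (pos < n) {
        size_t best_len = 0, best_off = 0;
        size_t start = pos > WINDOW ? pos - WINDOW : 0;
        /* search the window for the longest match starting before pos */
        for (size_t cand = start; cand < pos; cand++) {
            size_t len = 0;
            while (pos + len < n - 1 && s[cand + len] == s[pos + len])
                len++;   /* may run past pos: overlapping matches are allowed */
            if (len > best_len) { best_len = len; best_off = pos - cand; }
        }
        printf("(%zu, %zu, '%c')\n", best_off, best_len, s[pos + best_len]);
        pos += best_len + 1;
    }
    return 0;
}
```

On "abcabcabcx" this prints three literal triples and then (3, 6, 'x'): the match starts 3 bytes back yet runs for 6 bytes, overlapping its own output, which is legal in LZ77 and is how repeated runs are encoded cheaply.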
Log Files
In computing, logging is the act of keeping a log of events that occur in a computer system, such as problems, errors or broad information on current operations. These events may occur in the operating system or in other software. A message or ''log entry'' is recorded for each such event. These log messages can then be used to monitor and understand the operation of the system, to debug problems, or during an audit. Logging is particularly important in multi-user software, to have a central overview of the operation of the system. In the simplest case, messages are written to a file, called a ''log file''. Alternatively, the messages may be written to a dedicated logging system or to log management software, where they are stored in a database or on a different computer system. Specifically, a ''transaction log'' is a log of the communications between a system and the users of that system, or a data collection method that automatically captures the type, content, or time of transactions ...
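The simplest case named above, appending messages to a log file, takes only a few lines. A minimal sketch in C (the file name, timestamp format, and severity prefixes are arbitrary conventions for the demo, not a standard):

```c
#include <stdio.h>
#include <time.h>

/* append one timestamped entry to the given log file */
static void log_msg(const char *path, const char *msg) {
    FILE *f = fopen(path, "a");   /* "a": append, creating the file if missing */
    if (!f) return;
    time_t now = time(NULL);
    char stamp[32];
    strftime(stamp, sizeof stamp, "%Y-%m-%dT%H:%M:%S", localtime(&now));
    fprintf(f, "%s %s\n", stamp, msg);
    fclose(f);
}

int main(void) {
    log_msg("app.log", "INFO service started");          /* hypothetical entries */
    log_msg("app.log", "ERROR could not open config");
    return 0;
}
```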
Fermilab
Fermi National Accelerator Laboratory (Fermilab), located in Batavia, Illinois, near Chicago, is a United States Department of Energy national laboratory specializing in high-energy particle physics. Fermilab's Main Injector, two miles (3.3 km) in circumference, is the laboratory's most powerful particle accelerator. The accelerator complex that feeds the Main Injector is under upgrade, and construction of the first building for the new PIP-II linear accelerator began in 2020. Until 2011, Fermilab was the home of the 6.28 km (3.90 mi) circumference Tevatron accelerator. The ring-shaped tunnels of the Tevatron and the Main Injector are visible from the air and by satellite. Fermilab aims to become a world center in neutrino physics. It is the host of the multi-billion-dollar Deep Underground Neutrino Experiment (DUNE), now under construction. The project has suffered delays and, in 2022, the journals ''Science'' and ''Scientific American'' ...
Pareto Frontier
In multi-objective optimization, the Pareto front (also called Pareto frontier or Pareto curve) is the set of all Pareto efficient solutions. The concept is widely used in engineering. It allows the designer to restrict attention to the set of efficient choices, and to make tradeoffs within this set, rather than considering the full range of every parameter.

Definition

The Pareto frontier, ''P''(''Y''), may be more formally described as follows. Consider a system with function f: X \rightarrow \mathbb{R}^m, where ''X'' is a compact set of feasible decisions in the metric space \mathbb{R}^n, and ''Y'' is the feasible set of criterion vectors in \mathbb{R}^m, such that Y = \{ y \in \mathbb{R}^m : y = f(x), x \in X \}. We assume that the preferred directions of criteria values are known. A point y'' \in \mathbb{R}^m is preferred to (strictly dominates) another point y' \in \mathbb{R}^m, written as y'' \succ y'. The Pareto frontier is thus written as:

P(Y) = \{ y' \in Y : \{ y'' \in Y : y'' \succ y', y'' \neq y' \} = \emptyset \}.

Marginal rate of substitution

A significant aspect of the Pareto frontier ...
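For a finite set of criterion vectors, the definition reduces to keeping the points no other point dominates. A minimal sketch, assuming both criteria are to be maximized (the preference direction, sample points, and O(n²) dominance check are illustrative choices, not from the article):

```c
#include <stdio.h>

#define NPTS 5
#define M 2   /* number of criteria */

/* a dominates b if a is >= in every criterion and > in at least one */
static int dominates(const double *a, const double *b) {
    int strictly = 0;
    for (int j = 0; j < M; j++) {
        if (a[j] < b[j]) return 0;
        if (a[j] > b[j]) strictly = 1;
    }
    return strictly;
}

int main(void) {
    double y[NPTS][M] = {
        {1.0, 5.0}, {2.0, 4.0}, {2.0, 2.0}, {4.0, 1.0}, {3.0, 4.0}
    };
    for (int i = 0; i < NPTS; i++) {
        int dominated = 0;
        for (int k = 0; k < NPTS && !dominated; k++)
            if (k != i && dominates(y[k], y[i])) dominated = 1;
        if (!dominated)
            printf("(%.1f, %.1f) is on the Pareto front\n", y[i][0], y[i][1]);
    }
    return 0;
}
```

Here (2,4) and (2,2) are dominated by (3,4), leaving the front {(1,5), (3,4), (4,1)}: the tradeoff curve a designer would actually choose from.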
Bzip2
bzip2 is a free and open-source file compression program that uses the Burrows–Wheeler algorithm. It only compresses single files and is not a file archiver; it relies on separate external utilities such as tar for handling multiple files, and on other tools for encryption and archive splitting. bzip2 was initially released in 1996 by Julian Seward. It compresses most files more effectively than the older LZW and Deflate compression algorithms but is slower. bzip2 is particularly efficient for text data, and decompression is relatively fast. The algorithm uses several layers of compression techniques: run-length encoding (RLE), the Burrows–Wheeler transform (BWT), the move-to-front transform (MTF), and Huffman coding. bzip2 compresses data in blocks between 100 and 900 kB and uses the Burrows–Wheeler transform to convert frequently recurring character sequences into strings of identical letters. The move-to-front transform and Huffman coding are then ...
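The move-to-front stage named above is simple enough to sketch directly (an illustration of MTF alone, not of bzip2's full pipeline; the sample input is arbitrary):

```c
#include <stdio.h>
#include <string.h>

/* Move-to-front: each byte is replaced by its current position in a
   recency list and then moved to the front of that list.  Runs of one
   byte (as BWT output tends to contain) become runs of zeros, which the
   following Huffman stage encodes very compactly. */
int main(void) {
    unsigned char list[256];
    for (int i = 0; i < 256; i++) list[i] = (unsigned char)i;

    const unsigned char in[] = "aaabbbaaac";
    size_t n = strlen((const char *)in);

    for (size_t i = 0; i < n; i++) {
        int pos = 0;
        while (list[pos] != in[i]) pos++;      /* find the symbol's rank */
        printf("%d ", pos);
        memmove(list + 1, list, (size_t)pos);  /* shift the prefix right */
        list[0] = in[i];                       /* move the symbol to front */
    }
    printf("\n");
    return 0;
}
```

The repeated letters in "aaabbbaaac" come out as zeros after the first occurrence, which is exactly the skew the entropy coder exploits.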
LHA (file Format)
LHA or LZH is a freeware compression utility and associated file format. It was created in 1988 by Haruyasu Yoshizaki, a medical doctor, and originally named LHarc. A complete rewrite of LHarc, tentatively named ''LHx'', was eventually released as ''LH''. It was then renamed ''LHA'' to avoid conflicting with the then-new MS-DOS 5.0 LH ("load high") command. The original LHA and its Windows port, LHA32, are no longer in development because Yoshizaki is busy at his day job. Although no longer much used in the West, LHA remained popular in Japan until the 2000s. It was used by id Software to compress installation files for their earlier games, including ''Doom'' and ''Quake''. Because some versions of LHA have been distributed with source code under a permissive license, LHA has been ported to many operating systems and is still the main archiving format used on the Amiga computer, although it competed with LZX in the mid-1990s. This was due to Aminet, the world's largest archive of Amiga-related ...