In data compression, BCJ, short for branch/call/jump, refers to a technique that improves the compression of machine code by replacing relative branch addresses with absolute ones. This allows a Lempel–Ziv compressor to identify duplicate targets and encode them more efficiently. On decompression, the inverse filter restores the original encoding. Different BCJ filters are used for different instruction sets, as each uses different opcodes for branching.
A form of BCJ is seen in Microsoft's cabinet file format from 1996, which filters x86 CALL instructions for the LZX compressor. The 7z and xz file formats implement BCJ for multiple architectures. ZPAQ calls its x86 BCJ filter "E8E9", after the opcode values of the relative CALL (0xE8) and JMP (0xE9) instructions.
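The address rewriting described above can be illustrated with a minimal sketch for x86, where opcodes 0xE8 (CALL rel32) and 0xE9 (JMP rel32) are each followed by a 32-bit little-endian offset relative to the next instruction. The function name `e8e9_filter` and the helper `call` are hypothetical, and the sketch assumes the buffer is loaded at address 0. It does not reproduce the filter of any particular format: real implementations typically add heuristics to avoid rewriting bytes that merely look like branch opcodes.

```python
import struct

CALL, JMP = 0xE8, 0xE9  # x86 opcodes followed by a 32-bit relative offset


def e8e9_filter(code: bytes, encode: bool) -> bytes:
    """Naive sketch: rewrite rel32 operands as absolute addresses, or back."""
    out = bytearray(code)
    i = 0
    while i + 5 <= len(out):
        if out[i] in (CALL, JMP):
            (value,) = struct.unpack_from("<I", out, i + 1)
            next_ip = i + 5  # rel32 is relative to the following instruction
            if encode:
                value = (value + next_ip) & 0xFFFFFFFF  # relative -> absolute
            else:
                value = (value - next_ip) & 0xFFFFFFFF  # absolute -> relative
            struct.pack_into("<I", out, i + 1, value)
            i += 5  # skip the operand so it is never rescanned
        else:
            i += 1
    return bytes(out)


def call(at: int, target: int) -> bytes:
    """Assemble a CALL rel32 located at offset `at` that jumps to `target`."""
    rel = (target - (at + 5)) & 0xFFFFFFFF
    return bytes([CALL]) + struct.pack("<I", rel)


# Two calls to the same target from different places, separated by NOPs.
code = call(0, 0x100) + b"\x90" * 11 + call(16, 0x100)
filtered = e8e9_filter(code, encode=True)
restored = e8e9_filter(filtered, encode=False)
```

In `code` the two calls carry different offsets, because each is measured from its own position. In `filtered` both become the same five-byte sequence, which is the kind of repetition a Lempel–Ziv compressor can match. Because rewritten operand bytes are always skipped, the decoder visits the same positions as the encoder and the transform is reversible.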
bsdiff, a tool for delta updates, avoids the need to write architecture-specific BCJ tools by encoding bytewise differences. This makes it much more effective than "match and copy" tools such as VCDIFF, giving an output size of only 6% for Google Chrome. However, Google's Courgette, which adds a layer of explicit disassembly, is able to produce 9× smaller diffs.
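The benefit of bytewise differences can be sketched with a toy example. It assumes a hypothetical "program" made only of 32-bit addresses, all of which shift by the same amount between two versions. The names `bytewise_diff` and `apply_diff` are illustrative, and `bz2` serves only as a general-purpose compressor for the comparison. This shows the differencing idea alone, not bsdiff's match search or patch format.

```python
import bz2
import random
import struct


def bytewise_diff(old: bytes, new: bytes) -> bytes:
    """Subtract corresponding bytes modulo 256."""
    return bytes((n - o) & 0xFF for o, n in zip(old, new))


def apply_diff(old: bytes, diff: bytes) -> bytes:
    """Invert bytewise_diff by adding the differences back modulo 256."""
    return bytes((o + d) & 0xFF for o, d in zip(old, diff))


rng = random.Random(0)  # fixed seed so the example is reproducible

# Hypothetical old version: 2000 little-endian 32-bit addresses.
old_addrs = [rng.randrange(0x1000, 0x100000) for _ in range(2000)]
# Hypothetical new version: code was inserted, so every address moved by 64.
new_addrs = [a + 0x40 for a in old_addrs]

old = b"".join(struct.pack("<I", a) for a in old_addrs)
new = b"".join(struct.pack("<I", a) for a in new_addrs)

diff = bytewise_diff(old, new)
size_raw = len(bz2.compress(new))    # compressing the new version directly
size_diff = len(bz2.compress(diff))  # compressing the bytewise difference
```

Although almost every address changes, the difference stream consists mostly of zeros plus a constant low byte and occasional carries, so it compresses far better than the new data itself, without any knowledge of the instruction set.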
Effect
For a squashfs image of a Fedora Linux 31 live image, using x86 BCJ saves an extra 30 MB out of the ~1.7 GB compressed size, but doubles the installation time.