Block suballocation is a feature of some computer file systems which allows large blocks or allocation units to be used while making efficient use of empty space at the end of large files, space which would otherwise be lost to internal fragmentation.
In file systems that don't support fragments, this feature is also called tail merging or tail packing because it is commonly done by packing the "tail", or last partial block, of multiple files into a single block.
Rationale
File systems have traditionally divided the disk into equally sized blocks to simplify their design and limit the worst-case fragmentation. Block sizes are typically multiples of 512 bytes due to the size of hard disk sectors. When files are allocated by some traditional file systems, only whole blocks can be allocated to individual files. But as file sizes are often not multiples of the file system block size, this design inherently leaves the last block of a file (called the tail) only partly occupied, resulting in what is called internal fragmentation (not to be confused with external fragmentation). This waste of space can be significant if the file system stores many small files and can become critical when attempting to use larger block sizes to improve performance.
UFS and other derived UNIX file systems support fragments which greatly mitigate this effect.
Suballocation schemes
Block suballocation addresses this problem by dividing up a tail block in some way to allow it to store fragments from other files.
Some block suballocation schemes can perform allocation at the byte level; most, however, simply divide up the block into smaller ones (the divisor usually being some power of 2). For example, if a 38 KiB file is to be stored in a file system using 32 KiB blocks, the file would normally span two blocks, or 64 KiB, for storage; the remaining 26 KiB of the second block becomes unused slack space. With an 8 KiB block suballocation, however, the file would occupy just 6 KiB of the second block, leave 2 KiB (of the 8 KiB suballocation block) slack and free the other 24 KiB of the block for other files.
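The arithmetic in this example can be checked with a short Python sketch. The function name `allocated_bytes` and its parameters are hypothetical and chosen for illustration only; the sketch models the rounding rules described above, not the on-disk layout of any particular file system.

```python
KIB = 1024

def allocated_bytes(file_size, block_size, sub_block_size=None):
    """Return the number of bytes reserved for a file (illustrative model).

    Without suballocation the file is rounded up to whole blocks. With
    suballocation only the tail is rounded up, to whole sub-blocks.
    """
    full_blocks, tail = divmod(file_size, block_size)
    if tail == 0:
        return full_blocks * block_size
    if sub_block_size is None:
        return (full_blocks + 1) * block_size
    sub_blocks = -(-tail // sub_block_size)  # ceiling division
    return full_blocks * block_size + sub_blocks * sub_block_size

size = 38 * KIB
plain = allocated_bytes(size, 32 * KIB)
suballocated = allocated_bytes(size, 32 * KIB, 8 * KIB)
print((plain - size) // KIB)         # slack without suballocation, in KiB
print((suballocated - size) // KIB)  # slack with 8 KiB sub-blocks, in KiB
```

For the 38 KiB file this prints 26 and then 2, matching the slack figures given in the text.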
Tail packing
Some file systems have since been designed to take advantage of this unused space, and can pack the tails of several files in a single shared tail block. While this may, at first, seem like it would significantly increase file system fragmentation, the negative effect can be mitigated with readahead features on modern operating systems: when dealing with short files, several tails may be close enough to each other to be read together, and thus a disk seek is not introduced. Such file systems often employ heuristics in order to determine whether tail packing is worthwhile in a given situation, and defragmentation software may use a more evolved heuristic.
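As a rough illustration of packing several tails into shared blocks, the following Python sketch uses a simple first-fit policy. Both the policy and the name `pack_tails` are assumptions made for this example; real file systems use their own heuristics and on-disk structures, which are not modelled here.

```python
def pack_tails(file_sizes, block_size):
    """Pack file tails into shared blocks using first-fit (illustrative only).

    Returns (full_block_count, tail_blocks), where tail_blocks is a list of
    shared blocks, each a list of (file_index, tail_length) pairs.
    """
    full_block_count = 0
    tail_blocks = []  # contents of each shared tail block
    free = []         # free bytes remaining in each shared tail block
    for index, size in enumerate(file_sizes):
        full, tail = divmod(size, block_size)
        full_block_count += full
        if tail == 0:
            continue  # the file ends exactly on a block boundary
        for i, space in enumerate(free):
            if tail <= space:
                # First existing shared block with enough room.
                tail_blocks[i].append((index, tail))
                free[i] -= tail
                break
        else:
            # No shared block has room: open a new one.
            tail_blocks.append([(index, tail)])
            free.append(block_size - tail)
    return full_block_count, tail_blocks

full, shared = pack_tails([5000, 1200, 3000, 700], 4096)
print(full + len(shared))  # total blocks used with tail packing
```

With these four hypothetical file sizes and 4096-byte blocks, the tails of files 0, 1 and 3 share one block and the tail of file 2 gets its own, so three blocks are used in total. Without packing, the same files would need five blocks.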
Efficiency
In some scenarios where the majority of files are shorter than half the block size, such as in a folder of small source code files or small bitmap images, tail packing can increase storage efficiency more than twofold compared to file systems without tail packing.
This not only translates into conservation of disk space, but may also improve performance: because of the higher locality of reference, less data has to be read, which also translates into higher page cache efficiency. However, these advantages can be negated by the increased complexity of implementation.
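The "more than twofold" claim can be illustrated with a small calculation. The sketch below compares the fraction of allocated space that holds file data with and without packing. It assumes an idealized, byte-level packer and ignores metadata overhead, so the packed figure is an upper bound rather than a measurement of any real file system; the function names are hypothetical.

```python
def efficiency_without_packing(file_sizes, block_size):
    """Fraction of allocated space holding data when every file uses whole blocks."""
    allocated = sum(-(-size // block_size) * block_size for size in file_sizes)
    return sum(file_sizes) / allocated

def efficiency_with_ideal_packing(file_sizes, block_size):
    """Same fraction when all tails are packed densely (idealized upper bound)."""
    full_blocks = sum(size // block_size for size in file_sizes)
    tail_bytes = sum(size % block_size for size in file_sizes)
    tail_blocks = -(-tail_bytes // block_size)  # ceiling division
    return sum(file_sizes) / ((full_blocks + tail_blocks) * block_size)

# Hypothetical workload: 1000 files of 1 KiB each on 4 KiB blocks.
files = [1024] * 1000
print(efficiency_without_packing(files, 4096))
print(efficiency_with_ideal_packing(files, 4096))
```

For this workload the two values are 0.25 and 1.0, a fourfold difference. Files closer to half the block size would give a ratio nearer to two.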
The most widely used read-write file systems with support for block suballocation are Btrfs and FreeBSD UFS2 (where it is called "block level fragmentation"). ReiserFS and Reiser4 also support tail packing.
Several read-only file systems do not use blocks at all and are thus implicitly using space as efficiently as suballocating file systems; such file systems double as archive formats.
See also
* File system
* Internal fragmentation
* Locality of reference
* Comparison of file systems