Defragmenting
   HOME

TheInfoList



OR:

In the maintenance of file systems, defragmentation is a process that reduces the degree of fragmentation. It does this by physically organizing the contents of the
mass storage In computing, mass storage refers to the storage of large amounts of data in a persisting and machine-readable fashion. In general, the term ''mass'' in ''mass storage'' is used to mean ''large'' in relation to contemporaneous hard disk drive ...
device used to store files into the smallest number of
contiguous Contiguity or contiguous may refer to: *Contiguous data storage, in computer science *Contiguity (probability theory) *Contiguity (psychology) *Contiguous distribution of species, in biogeography *Geographic contiguity Geographic contiguity is t ...
regions (fragments, ''extents''). It also attempts to create larger regions of free space using compaction to impede the return of fragmentation. Defragmentation is advantageous and relevant to file systems on electromechanical
disk drives Data storage is the recording (storing) of information (data) in a storage medium. Handwriting, phonographic recording, magnetic tape, and optical discs are all examples of storage media. Biological molecules such as RNA and DNA are cons ...
(
hard disk drive A hard disk drive (HDD), hard disk, hard drive, or fixed disk is an electro-mechanical data storage device that stores and retrieves digital data using magnetic storage with one or more rigid rapidly rotating hard disk drive platter, pla ...
s,
floppy disk drives A floppy disk or floppy diskette (casually referred to as a floppy, a diskette, or a disk) is a type of disk storage composed of a thin and flexible disk of a magnetic storage medium in a square or nearly square plastic enclosure lined with a ...
and optical disk media). The movement of the hard drive's read/write heads over different areas of the disk when accessing fragmented files is slower, compared to accessing the entire contents of a non-fragmented file sequentially without moving the read/write heads to
seek SEEK Limited is an Australian employment website for job listings, headquartered in Melbourne, Victoria. Seek also operates in China, Hong Kong, Indonesia, Malaysia, New Zealand, Philippines, Singapore and Thailand. History Seek was founded ...
other fragments.


Causes of fragmentation

Fragmentation occurs when the file system cannot or will not allocate enough contiguous space to store a complete file as a unit, but instead puts parts of it in gaps between existing files (usually those gaps exist because they formerly held a file that the file system has subsequently deleted or because the file system allocated excess space for the file in the first place). Files that are often appended to (as with log files) as well as the frequent adding and deleting of files (as with emails and web browser cache), larger files (as with videos) and greater numbers of files contribute to fragmentation and consequent performance loss. Defragmentation attempts to alleviate these problems.


Example

An otherwise blank disk has five files, A through E, each using 10 blocks of space (for this section, a ''block'' is an allocation unit of the filesystem; the block size is set when the disk is formatted and can be any size supported by the filesystem). On a blank disk, all of these files would be allocated one after the other (see example 1 in the image). If file B were to be deleted, there would be two options: mark the space for file B as empty to be used again later, or move all the files after B so that the empty space is at the end. Since moving the files could be time-consuming if there were many files which needed to be moved, usually the empty space is simply left there, marked in a table as available for new files (see example 2 in the image). When a new file, F, is allocated requiring 6 blocks of space, it could be placed into the first 6 blocks of the space that formerly held file B, and the 4 blocks following it will remain available (see example 3 in the image). If another new file, G, is added and needs only 4 blocks, it could then occupy the space after F and before C (example 4 in the image) Each file section becomes stored in separate packets across the drive, by deleting this, blank spaces are left creating a fragmentated disk. However, if file F then needs to be expanded, there are three options, since the space immediately following it is no longer available: # Move the file F to where it can be created as one contiguous file of the new, larger size. This would not be possible if the file is larger than the largest contiguous space available. The file could also be so large that the operation would take an undesirably long period of time. # Move all the files after F until one opens enough space to make it contiguous again. This presents the same problem as in the previous example: if there are a small number of files or not much data to move, it isn't a big problem, but if there are thousands or even tens of thousands of files, there isn't enough time to move all those files. # Add a new block somewhere else, and indicate that F has a second ''extent'' (see example 5 in the image). Repeat this hundreds of times and the filesystem will have a number of small free segments scattered in many places, and some files will have multiple extents. When a file has many extents like this, access time for that file may become excessively long because of all the random seeking the disk will have to do when reading it. Additionally, the concept of “fragmentation” is not only limited to individual files that have multiple extents on the disk. For instance, a group of files normally read in a particular sequence (like files accessed by a program when it is loading, which can include certain DLLs, various resource files, the audio/visual media files in a game) can be considered fragmented if they are not in sequential load-order on the disk, even if these individual files are ''not'' fragmented; the read/write heads will have to seek these (non-fragmented) files randomly to access them in sequence. Some groups of files may have been originally installed in the correct sequence, but drift apart with time as certain files within the group are deleted. Updates are a common cause of this, because in order to update a file, most updaters usually delete the old file first, and then write a new, updated one in its place. However, most filesystems do not write the new file in the same physical place on the disk. This allows unrelated files to fill in the empty spaces left behind.


Mitigation

Defragmentation is the operation of moving file extents (physical allocation blocks) so they eventually merge, preferably into one. Doing so usually requires at least two copy operations: one to move the blocks into some free scratch space on the disk so more movement can happen, and another to finally move the blocks into their intended place. In such a paradigm, no data is ever removed from the disk, so that the operation can be safely stopped even in the event of a power loss. The article picture depicts an example. To defragment a disk, defragmentation software (also known as a "defragmenter") can only move files around within the free space available. This is an intensive operation and cannot be performed on a filesystem with little or no free space. During defragmentation, system performance will be degraded, and it is best to leave the computer alone during the process so that the defragmenter does not get confused by unexpected changes to the filesystem. Depending on the algorithm used it may or may not be advantageous to perform multiple passes. The reorganization involved in defragmentation does not change logical location of the files (defined as their location within the directory structure). Besides defragmenting program files, the defragmenting tool can also reduce the time it takes to load programs and open files. For example, the Windows 9x defragmenter included the Intel Application Launch Accelerator which optimized programs on the disk by placing the defragmented program files and their dependencies next to each other, in the order in which the program loads them, to load these programs faster. In Windows, a good defragmenter will read the
Prefetch Prefetching is a technique used in computing to improve performance by retrieving data or instructions before they are needed. By predicting what a program will request in the future, the system can load information in advance to reduced wait times ...
files to identify as many of these file groups as possible and place the files within them in access sequence. At the beginning of the hard drive, the outer tracks have a higher data transfer rate than the inner tracks. Placing frequently accessed files onto the outer tracks increases performance. Third party defragmenters, such as MyDefrag, will move frequently accessed files onto the outer tracks and defragment these files. Improvements in modern hard drives such as
RAM Ram, ram, or RAM most commonly refers to: * A male sheep * Random-access memory, computer memory * Ram Trucks, US, since 2009 ** List of vehicles named Dodge Ram, trucks and vans ** Ram Pickup, produced by Ram Trucks Ram, ram, or RAM may also ref ...
cache, faster platter rotation speed, command queuing (
SCSI Small Computer System Interface (SCSI, ) is a set of standards for physically connecting and transferring data between computers and peripheral devices, best known for its use with storage devices such as hard disk drives. SCSI was introduced ...
/ ATA TCQ or
SATA SATA (Serial AT Attachment) is a computer bus interface that connects host bus adapters to mass storage devices such as hard disk drives, optical drives, and solid-state drives. Serial ATA succeeded the earlier Parallel ATA (PATA) standard ...
NCQ), and greater data density reduce the negative impact of fragmentation on system performance to some degree, though increases in commonly used data quantities offset those benefits. However, modern systems profit enormously from the huge disk capacities currently available, since partially filled disks fragment much less than full disks, and on a high-capacity HDD, the same partition occupies a smaller range of cylinders, resulting in faster seeks. However, the average
access time Access time is the time delay or latency between a request to an electronic system, and the access being initiated or the requested data returned. In computer and software systems, it is the time interval between the point where an instructio ...
can never be lower than a half rotation of the platters, and platter rotation (measured in rpm) is the speed characteristic of HDDs which has experienced the slowest growth over the decades (compared to data transfer rate and seek time), so minimizing the number of seeks remains beneficial in most storage-heavy applications. Defragmentation is just that: ensuring that there is at most one seek per file, counting only the seeks to non-adjacent tracks.


Partitioning

A common strategy to optimize defragmentation and to reduce the impact of fragmentation is to partition the hard disk(s) in a way that separates partitions of the file system that experience many more reads than writes from the more volatile zones where files are created and deleted frequently. The directories that contain the users' profiles are modified constantly (especially with the Temp directory and web browser cache creating thousands of files that are deleted in a few days). If files from user profiles are held on a dedicated partition (as is commonly done on
UNIX Unix (, ; trademarked as UNIX) is a family of multitasking, multi-user computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, a ...
recommended files systems, where it is typically stored in the /var partition), the defragmenter runs better since it does not need to deal with all the static files from other directories. (Alternatively, a defragmenter can be told to simply exclude certain file paths.) For partitions with relatively little write activity, defragmentation time greatly improves after the first defragmentation, since the defragmenter will need to defragment only a small number of new files in the future.


Offline defragmentation

The presence of immovable system files, especially a
swap file In computer operating systems, memory paging is a memory management scheme that allows the physical memory used by a program to be non-contiguous. This also helps avoid the problem of memory fragmentation and requiring compaction to reduce fra ...
, can impede defragmentation. These files can be safely moved when the operating system is not in use. For example,
ntfsresize ntfsresize is a free Unix utility that non-destructively resizes the NTFS filesystem used by Windows NT 4.0, 2000, XP, 2003, Vista, 7, 8, 10, and 11 typically on a hard-disk partition. All NTFS versions used by 32-bit and 64-bit Windows ...
moves these files to resize an
NTFS NT File System (NTFS) (commonly called ''New Technology File System'') is a proprietary journaling file system developed by Microsoft in the 1990s. It was developed to overcome scalability, security and other limitations with File Allocation Tabl ...
partition. The tool PageDefrag could defragment Windows system files such as the swap file and the files that store the
Windows registry The Windows Registry is a hierarchical database that stores low-level settings for the Microsoft Windows operating system and for applications that opt to use the registry. The kernel, device drivers, services, Security Accounts Manager, a ...
by running at boot time before the GUI is loaded. Since Windows Vista, the feature is not fully supported and has not been updated. In NTFS, as files are added to the disk, the
Master File Table NT File System (NTFS) (commonly called ''New Technology File System'') is a proprietary journaling file system developed by Microsoft in the 1990s. It was developed to overcome scalability, security and other limitations with FAT. NTFS adds sever ...
(MFT) must grow to store the information for the new files. Every time the MFT cannot be extended due to some file being in the way, the MFT will gain a fragment. In early versions of Windows, it could not be safely defragmented while the partition was mounted, and so Microsoft wrote a hardblock in the defragmenting
API An application programming interface (API) is a connection between computers or between computer programs. It is a type of software interface, offering a service to other pieces of software. A document or standard that describes how to build ...
. However, since
Windows XP Windows XP is a major release of Microsoft's Windows NT operating system. It was released to manufacturing on August 24, 2001, and later to retail on October 25, 2001. It is a direct successor to Windows 2000 for high-end and business users a ...
, an increasing number of defragmenters are now able to defragment the MFT, because the Windows defragmentation API has been improved and now supports that move operation. Even with the improvements, the first four clusters of the MFT remain unmovable by the Windows defragmentation API, resulting in the fact that some defragmenters will store the MFT in two fragments: The first four clusters wherever they were placed when the disk was formatted, and then the rest of the MFT at the beginning of the disk (or wherever the defragmenter's strategy deems to be the best place).


Solid-state disks

When reading data from a conventional electromechanical hard disk drive, the disk controller must first position the head, relatively slowly, to the track where a given fragment resides, and then wait while the disk platter rotates until the fragment reaches the head. A
solid-state drive A solid-state drive (SSD) is a type of solid-state storage device that uses integrated circuits to store data persistently. It is sometimes called semiconductor storage device, solid-state device, or solid-state disk. SSDs rely on non- ...
(SSD) is based on
flash memory Flash memory is an Integrated circuit, electronic Non-volatile memory, non-volatile computer memory storage medium that can be electrically erased and reprogrammed. The two main types of flash memory, NOR flash and NAND flash, are named for t ...
with no moving parts, so
random access Random access (also called direct access) is the ability to access an arbitrary element of a sequence in equal time or any datum from a population of addressable elements roughly as easily and efficiently as any other, no matter how many elemen ...
of a file fragment on flash memory does not suffer this delay, making defragmentation to optimize access speed unnecessary. Furthermore, since flash memory can be written to only a limited number of times before it fails, defragmentation is actually detrimental (except in the mitigation of
catastrophic failure A catastrophic failure is a sudden and total failure from which recovery is impossible. Catastrophic failures often lead to cascading systems failure. The term is most commonly used for structural failures, but has often been extended to many ot ...
). However, Windows still defragments an SSD automatically (albeit less vigorously) to prevent the file system from reaching its maximum fragmentation tolerance (when the metadata can’t represent any more file fragments). Once the maximum fragmentation limit is reached, subsequent attempts to write to disk fail.


Approach and defragmenters by file-system type

*
FAT In nutrition science, nutrition, biology, and chemistry, fat usually means any ester of fatty acids, or a mixture of such chemical compound, compounds, most commonly those that occur in living beings or in food. The term often refers specif ...
: MS-DOS 6.x and Windows 9x-systems come with a defragmentation utility called Defrag. The
DOS DOS (, ) is a family of disk-based operating systems for IBM PC compatible computers. The DOS family primarily consists of IBM PC DOS and a rebranded version, Microsoft's MS-DOS, both of which were introduced in 1981. Later compatible syste ...
version is a limited version of Norton SpeedDisk. The version that came with Windows 9x was licensed from
Symantec Corporation Gen Digital Inc. (formerly Symantec Corporation and NortonLifeLock Inc.) is a multinational software company co-headquartered in both Prague, Czech Republic ( EU) and Tempe, Arizona ( USA). The company provides cybersecurity software and servi ...
, and the version that came with Windows 2000 and XP is licensed from Condusiv Technologies. *
NTFS NT File System (NTFS) (commonly called ''New Technology File System'') is a proprietary journaling file system developed by Microsoft in the 1990s. It was developed to overcome scalability, security and other limitations with File Allocation Tabl ...
was introduced with
Windows NT 3.1 Windows NT 3.1 is the first major release of the Windows NT operating system developed by Microsoft, released on July 27, 1993. It marked the company's entry into the corporate computing environment, designed to support large networks and to be ...
, but the NTFS filesystem driver did not include any defragmentation capabilities. In Windows NT 4.0, defragmenting APIs were introduced that third-party tools could use to perform defragmentation tasks; however, no defragmentation software was included. In
Windows 2000 Windows 2000 is a major release of the Windows NT operating system developed by Microsoft, targeting the server and business markets. It is the direct successor to Windows NT 4.0, and was Software release life cycle#Release to manufacturing (RT ...
,
Windows XP Windows XP is a major release of Microsoft's Windows NT operating system. It was released to manufacturing on August 24, 2001, and later to retail on October 25, 2001. It is a direct successor to Windows 2000 for high-end and business users a ...
and
Windows Server 2003 Windows Server 2003, codenamed "Whistler Server", is the sixth major version of the Windows NT operating system produced by Microsoft and the first server version to be released under the Windows Server brand name. It is part of the Windows NT ...
,
Microsoft Microsoft Corporation is an American multinational corporation and technology company, technology conglomerate headquartered in Redmond, Washington. Founded in 1975, the company became influential in the History of personal computers#The ear ...
included a defragmentation tool based on Diskeeper that made use of the defragmentation APIs and was a snap-in for Computer Management. In
Windows Vista Windows Vista is a major release of the Windows NT operating system developed by Microsoft. It was the direct successor to Windows XP, released five years earlier, which was then the longest time span between successive releases of Microsoft W ...
,
Windows 7 Windows 7 is a major release of the Windows NT operating system developed by Microsoft. It was Software release life cycle#Release to manufacturing (RTM), released to manufacturing on July 22, 2009, and became generally available on October 22, ...
and
Windows 8 Windows 8 is a major release of the Windows NT operating system developed by Microsoft. It was Software release life cycle#Release to manufacturing (RTM), released to manufacturing on August 1, 2012, made available for download via Microsoft ...
, the tool has been greatly improved and was given a new interface with no visual diskmap and is no longer part of Computer Management. There are also a number of free and commercial third-party defragmentation products available for
Microsoft Windows Windows is a Product lining, product line of Proprietary software, proprietary graphical user interface, graphical operating systems developed and marketed by Microsoft. It is grouped into families and subfamilies that cater to particular sec ...
. *
BSD The Berkeley Software Distribution (BSD), also known as Berkeley Unix or BSD Unix, is a discontinued Unix operating system developed and distributed by the Computer Systems Research Group (CSRG) at the University of California, Berkeley, beginni ...
UFS and particularly
FreeBSD FreeBSD is a free-software Unix-like operating system descended from the Berkeley Software Distribution (BSD). The first version was released in 1993 developed from 386BSD, one of the first fully functional and free Unix clones on affordable ...
uses an internal reallocator that seeks to reduce fragmentation right in the moment when the information is written to disk. This effectively controls system degradation after extended use. *
Btrfs Btrfs (pronounced as "better F S", "butter F S", "b-tree F S", or "B.T.R.F.S.") is a computer storage format that combines a file system based on the copy-on-write (COW) principle with a logical volume manager (distinct from Linux's LVM), d ...
has online and automatic defragmentation available. *
Linux Linux ( ) is a family of open source Unix-like operating systems based on the Linux kernel, an kernel (operating system), operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically package manager, pac ...
ext2 ext2, or second extended file system, is a file system for the Linux kernel (operating system), kernel. It was initially designed by French software developer Rémy Card as a replacement for the extended file system (ext). Having been designed ...
,
ext3 ext3, or third extended filesystem, is a journaling file system, journaled file system that is commonly used with the Linux kernel. It used to be the default file system for many popular Linux distributions but generally has been supplanted by ...
, and
ext4 ext4 (fourth extended filesystem) is a journaling file system for Linux, developed as the successor to ext3. ext4 was initially a series of backward-compatible extensions to ext3, many of them originally developed by Cluster File Systems for ...
: Much like UFS, these filesystems employ allocation techniques designed to keep fragmentation under control at all times. As a result, defragmentation is not needed in the vast majority of cases. ext2 uses an offline defragmenter called e2defrag, which does not work with its successor ext3. However, other programs, or filesystem-independent ones such as defragfs, may be used to defragment an ext3 filesystem. ext4 is somewhat
backward compatible In telecommunications and computing, backward compatibility (or backwards compatibility) is a property of an operating system, software, real-world product, or technology that allows for interoperability with an older legacy system, or with inpu ...
with ext3, and thus has generally the same amount of support from defragmentation programs. Currently e4defrag can be used to defragment an ext4 filesystem, including online defragmentation. *
VxFS The VERITAS File System (or VxFS; called JFS and OnlineJFS in HP-UX) is an extent-based file system. It was originally developed by VERITAS Software. Through an OEM agreement, VxFS is used as the primary filesystem of the HP-UX operating sys ...
has the fsadm utility that includes defrag operations. * JFS has the defragfs utility on IBM operating systems. *
HFS Plus HFS Plus or HFS+ (also known as Mac OS Extended or HFS Extended) is a journaling file system developed by Apple Inc. It replaced the Hierarchical File System (HFS) as the primary file system of Apple computers with the 1998 release of Mac OS 8. ...
introduced in 1998 with Mac OS 8.1 has a number of optimizations to the allocation algorithms in an attempt to defragment files while they are being accessed without a separate defragmenter. There are several restrictions for files to be candidates for 'on-the-fly' defragmentation (including a maximum size 20MB). There is a utility, , by Coriolis Systems available since OS X 10.3. On traditional Mac OS defragmentation can be done by Norton SpeedDisk and TechTool Pro. * WAFL in
NetApp NetApp, Inc. is an American data infrastructure company that provides unified data storage, integrated data services, and cloud operations (CloudOps) solutions to enterprise customers. The company is based in San Jose, California. It has ranked ...
's ONTAP 7.2 operating system has a command called reallocate that is designed to defragment large files. *
XFS XFS is a high-performance 64-bit journaling file system created by Silicon Graphics, Inc (SGI) in 1993. It was the default file system in SGI's IRIX operating system starting with its version 5.3. XFS was ported to the Linux kernel in 2001; a ...
provides an online defragmentation utility called xfs_fsr. * SFS processes the defragmentation feature in almost completely stateless way (apart from the location it is working on), so defragmentation can be stopped and started instantly. * ADFS, the file system used by
RISC OS RISC OS () is an operating system designed to run on ARM architecture, ARM computers. Originally designed in 1987 by Acorn Computers of England, it was made for use in its new line of ARM-based Acorn Archimedes, Archimedes personal computers an ...
and earlier
Acorn Computers Acorn Computers Ltd. was a British computer company established in Cambridge, England in 1978 by Hermann Hauser, Christopher Curry (businessman), Chris Curry and Andy Hopper. The company produced a number of computers during the 1980s with asso ...
, keeps file fragmentation under control without requiring manual defragmentation.


See also

*
Comparison of defragmentation software __NOTOC__ The following is a comparison of notable file system defragmentation software: Notes References {{Reflist, 30em External links The Big Windows 7 Defragmenter TestThe Big Windows XP Defragmenter Test
Defragmentation softw ...
*
Fragmentation (computing) In computer storage, fragmentation is a phenomenon in the computer system which involves the distribution of data in to smaller pieces which storage space, such as computer memory or a hard drive, is used inefficiently, reducing capacity or perfo ...
*
File system fragmentation In computing, file system fragmentation, sometimes called file system aging, is the tendency of a file system to lay out the contents of Computer file, files non-continuously to allow in-place modification of their contents. It is a special case of ...
*
Virtual disk image In computer science, storage virtualization is "the process of presenting a logical view of the physical storage resources to" a host computer system, "treating all storage media (hard disk, optical disk, tape, etc.) in the enterprise as a sing ...
*
Wear leveling Wear leveling (also written as wear levelling) is a technique Wear leveling techniques for flash memory systems. for prolonging the service life of some kinds of erasable computer storage media, such as flash memory, which is used in solid-state d ...
, a similar technique for prolonging flash memory content


Notes


References


Sources

* Norton, Peter (1994) ''Peter Norton's Complete Guide to DOS 6.22'', page 521 – Sams () * Woody Leonhard, Justin Leonhard (2005) ''Windows XP Timesaving Techniques For Dummies, Second Edition'' page 456 – For Dummies (). * Jensen, Craig (1994). ''Fragmentation: The Condition, the Cause, the Cure''. Executive Software International (). * Dave Kleiman, Laura Hunter, Mahesh Satyanarayana, Kimon Andreou, Nancy G Altholz, Lawrence Abrams, Darren Windham, Tony Bradley and Brian Barber (2006) ''Winternals: Defragmentation, Recovery, and Administration Field Guide'' – Syngress () * Robb, Drew (2003) ''Server Disk Management in a Windows Environment'' Chapter 7 – AUERBACH ()


External links


The Big Windows 7 Defragmenter Test
Benchmarks of popular defrag utilities
Microsoft Windows XP defragmentation - How to schedule a weekly defragmentation
https://archive.today/20130209083314/http://windowsitpro.com/article/articleid/77564/jsi-tip-6260-startdefrag-will-schedule-a-windows-2000-defrag-and-it-is-free-for-personal-use.html Microsoft Windows 2000 Professional and Server defragmentation - How to schedule defragmentation]
SST Hard Disk Optimizer

How Linux avoids making files fragmented

How defragmentation was changed for Windows 7

Complete list of Defragmentation Utilities for Windows
{{Webarchive, url=https://web.archive.org/web/20160528170217/http://www.fact-reviews.com/defrag/Defragmentation.aspx , date=2016-05-28

Defragmentation software File system management de:Fragmentierung (Dateisystem)#Defragmentierung in Betriebssystemen