
File system fragmentation

File system fragmentation is the phenomenon where the data blocks of files or directories are stored in non-contiguous sectors on a storage device, resulting in multiple I/O operations to access a single file and thereby reducing overall system performance. This fragmentation typically develops over time due to dynamic file operations, such as the creation of new files in partially occupied spaces, the growth or shrinkage of existing files through appends or truncations, and the deletion of files, which leaves scattered free space holes that subsequent allocations cannot fully consolidate. Incremental file modifications, including overwrites and simultaneous multi-file writes, further exacerbate the issue by disrupting block locality and scattering logically sequential data.

The performance impacts are particularly pronounced on hard disk drives (HDDs), where non-contiguous blocks force the read/write heads to perform excessive seeks, leading to latency increases and throughput reductions of up to 62 times for aged file systems under realistic workloads like repeated repository updates. On solid-state drives (SSDs), fragmentation has a milder but still significant effect, as it transforms sequential accesses into random ones at the device level, causing read performance drops of 2 to 5 times or even up to 79% in cases of high fragmentation, alongside increased write amplification that accelerates wear.

Mitigation strategies include defragmentation tools, which rearrange scattered blocks into contiguous regions to restore spatial locality and improve access efficiency, as implemented for file systems such as NTFS, ext4, XFS, and Btrfs. Preventive approaches, such as advanced allocation algorithms in modern storage systems, aim to maintain contiguity from the outset, reducing the need for periodic defragmentation while preserving performance across both HDDs and SSDs.

Fundamentals

Definition and Basics

File system fragmentation refers to the condition in which a file system's space becomes inefficiently utilized because files, free space, or metadata are divided into non-contiguous pieces on the disk, complicating optimal allocation and retrieval. This occurs when the logical structure of data does not align with its physical layout on the medium, such as a hard disk drive or solid-state drive, leaving scattered blocks that must be retrieved from multiple locations. At its core, a file system manages storage by dividing the disk into fixed-size units known as blocks (or clusters in systems like FAT), which serve as the basic allocation units for data. Each block typically ranges from 512 bytes to several kilobytes, depending on the file system design, and files are stored by allocating one or more of these blocks contiguously when possible to enable efficient sequential access. However, as storage demands evolve, files may end up in non-contiguous blocks, with their data spread across distant sectors.

Key structures include the inode, a record that stores metadata (such as size, permissions, and timestamps) along with pointers to the allocated data blocks, and the directory, a special file that maps human-readable names to inodes, facilitating file organization and lookup. To visualize fragmentation, imagine a large file as a puzzle intended to fit neatly into adjacent blocks on the disk; instead, its pieces are scattered like isolated fragments across the surface, requiring the system to jump between remote areas to reassemble the whole, much like retrieving non-adjacent sectors on a platter. These foundational concepts underpin both internal fragmentation, where allocated space within a block goes unused, and external fragmentation, involving scattered free space, as explored in later sections.
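As a rough illustration of these concepts, the following Python sketch models an inode's block pointers as a plain list and counts how many physically contiguous extents a file occupies; the block numbers and function name are hypothetical and not tied to any real file system.

```python
def count_extents(block_pointers):
    """Count runs of consecutive block numbers in an inode's pointer list.

    A file stored in one contiguous run has a single extent; every break
    in the physical sequence adds another extent (another fragment).
    """
    if not block_pointers:
        return 0
    extents = 1
    for prev, curr in zip(block_pointers, block_pointers[1:]):
        if curr != prev + 1:      # gap between data blocks -> new extent
            extents += 1
    return extents

# A contiguous file versus a fragmented one (block numbers are illustrative).
print(count_extents([100, 101, 102, 103]))        # 1 extent
print(count_extents([100, 101, 250, 251, 400]))   # 3 extents
```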

Historical Context

File system fragmentation emerged as a challenge in the early days of digital storage, particularly with the advent of random-access disk systems in the 1950s and 1960s. The IBM 305 RAMAC system, introduced in 1956, was the first commercial computer to incorporate a hard disk drive (the IBM 350 Disk Storage Unit), using record-oriented access methods on fixed-size tracks that could lead to inefficient retrieval as data volumes increased. This lack of structured organization contributed to access delays, laying the groundwork for fragmentation concerns in subsequent systems. IBM's OS/360, announced in 1964, further exemplified these issues by relying on fixed block allocation for its direct-access storage devices (DASD), where deleting or resizing files created scattered free space, exacerbating seek inefficiencies on mechanical drives.

The development of Unix in the early 1970s introduced key concepts still central to file systems today. The original Unix file system (FS), implemented around 1971 at Bell Labs, divided disks into fixed-size blocks with inodes for metadata and indirect pointers for large files, but it suffered from fragmentation due to small block sizes (512-1024 bytes) and poor allocation locality. This led to the Berkeley Fast File System (FFS) in 1984, which improved performance through larger blocks, fragments for partial blocks, and cylinder groups that enhanced data locality and reduced external fragmentation.

The problem intensified in the late 1970s with the rise of personal computing and the introduction of the File Allocation Table (FAT) file system in 1977, developed by Bill Gates and Marc McDonald at Microsoft for floppy disk storage and later adopted in early DOS environments. FAT's cluster-based allocation scheme, which grouped fixed-size blocks into larger units to simplify management, inherently promoted external fragmentation as files expanded or were deleted, leaving non-contiguous gaps across the disk. This design choice, while enabling basic file chaining via a simple index table, amplified performance degradation on emerging hard disk drives (HDDs), where mechanical read/write heads required additional time to jump between scattered clusters. Research during this era, such as the 1972 study by Teorey and Pinkerton on disk scheduling policies, highlighted how non-sequential access patterns increased average seek times, underscoring fragmentation's role in hindering throughput on movable-head disks. By the 1980s, as HDD capacities grew and hard drives became widespread in personal computers, fragmentation became a pressing issue for PC users, prompting the development of early dedicated defragmentation tools to reorganize files into contiguous blocks and mitigate seek overhead on mechanical storage.

Advancements in the 1990s and 2000s sought to inherently reduce fragmentation through more sophisticated allocation strategies. Microsoft's NTFS, launched in 1993 with Windows NT 3.1, improved upon FAT by employing a Master File Table (MFT) for dynamic tracking and better space management, which minimized scattering of file extents compared to cluster chaining. Similarly, the ext4 file system, merged into the mainline Linux kernel in 2008, introduced extents, contiguous block ranges that replace indirect block pointers, significantly lowering metadata overhead and fragmentation for large files via features like delayed allocation and multi-block allocation. The shift from HDDs to solid-state drives (SSDs) in the 2010s further diminished fragmentation's impact, as SSDs eliminate mechanical seeks, though file systems continue to benefit from these allocation optimizations in HDD-based environments.

Causes

File Operations Leading to Fragmentation

File creation in file systems often leads to fragmentation when new files are allocated blocks from available free space that is not fully contiguous. As files are created, the allocator selects blocks that may be scattered due to prior usage patterns, splitting larger contiguous free areas into smaller, non-adjacent segments. For instance, out-of-order or simultaneous creation of multiple files can interleave their blocks, preventing optimal contiguous placement and increasing the number of non-contiguous extents per file.

File growth, particularly through appending data to existing files, exacerbates fragmentation when the additional data cannot fit within the originally allocated contiguous blocks. In such cases, the file system allocates new blocks in the nearest available free space, which may be distant from the existing allocation, so the file ends up spanning multiple non-adjacent locations. This is common in workloads involving incremental writes, such as logging or database updates, where small appends over time scatter blocks across the disk. Delays between appends can further worsen this by allowing intervening operations to occupy nearby free space.

File deletion contributes to fragmentation by freeing blocks and creating scattered "holes" in the previously occupied space, which fragments the overall free space layout. These holes make it difficult for subsequent allocations to find large contiguous regions, as the remaining free areas become isolated and unevenly distributed. Over time, repeated deletions in mixed workloads lead to a proliferation of small free fragments that cannot accommodate larger files without splitting them.

Consider a step-by-step example on a 50-block disk initially holding sequential files A, B, and C, each occupying contiguous blocks (A in blocks 1-10, B in 11-20, C in 21-30, with 31-50 free). First, creating a new file D of 10 blocks allocates it contiguously in 31-40, maintaining low fragmentation. Next, appending 5 blocks to file A requires space beyond its original allocation; with blocks 11-20 occupied by B, the append goes to 41-45, splitting A into two extents. Then, deleting file B frees blocks 11-20, creating a hole amid A's extents. Finally, growing file C by 15 blocks uses the B hole (11-20) plus the remaining tail of free space (46-50), leaving C with three non-contiguous extents and the disk as a whole fragmented. This progression illustrates how mixed operations transform an ordered layout into a scattered one, as simulated below.
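The walkthrough above can be reproduced with a small simulation. The sketch below is a toy model only, assuming a lowest-free-block-first allocator over a 50-block disk; the data structures and function names are hypothetical, not those of any real file system.

```python
DISK_BLOCKS = 50

disk = {b: None for b in range(1, DISK_BLOCKS + 1)}   # block -> owning file or None
files = {}                                            # file name -> logical block order

def allocate(name, nblocks):
    """Append nblocks to `name`, taking the lowest-numbered free blocks."""
    free = [b for b in sorted(disk) if disk[b] is None][:nblocks]
    for b in free:
        disk[b] = name
    files.setdefault(name, []).extend(free)

def delete(name):
    """Free every block owned by `name`, leaving holes behind."""
    for b in files.pop(name, []):
        disk[b] = None

def extents(name):
    """Collapse a file's logical block sequence into physically contiguous runs."""
    runs = []
    for b in files.get(name, []):
        if runs and b == runs[-1][1] + 1:
            runs[-1][1] = b
        else:
            runs.append([b, b])
    return [tuple(r) for r in runs]

# Initial layout: A, B, C contiguous in 1-10, 11-20, 21-30; blocks 31-50 free.
for name in "ABC":
    allocate(name, 10)

allocate("D", 10)   # D lands contiguously in 31-40
allocate("A", 5)    # append to A spills into 41-45: A now has 2 extents
delete("B")         # frees 11-20, leaving a hole between A's extents
allocate("C", 15)   # C grows into 11-20 and 46-50: 3 extents total

for name in ("A", "C", "D"):
    print(name, extents(name))
# A [(1, 10), (41, 45)]
# C [(21, 30), (11, 20), (46, 50)]
# D [(31, 40)]
```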

Allocation Strategies Contributing to Fragmentation

File systems employ various algorithms to manage free space allocation, which can inadvertently lead to fragmentation by creating scattered or unusable portions of disk space. The first-fit algorithm scans the list of free blocks or extents from the beginning and allocates the first sufficiently large contiguous region that meets the request size. This approach is simple and quick but often results in external fragmentation, as it tends to leave small, unusable gaps between allocated regions after repeated allocations and deallocations, particularly when larger requests follow smaller ones. In contrast, the best-fit algorithm searches the entire free space list to select the smallest available block or extent that accommodates the request, aiming to minimize the leftover space in the allocated unit. While this reduces immediate waste in the chosen block, it frequently exacerbates fragmentation over time by splintering larger free areas into numerous tiny remnants that are too small for future allocations, thus increasing the overall number of fragmented free space segments. The next-fit variant improves upon first-fit by resuming the search from the point of the previous allocation rather than the start of the list, which reduces search overhead and helps distribute allocations more evenly across the disk. Like first-fit, however, it can still produce small unusable gaps, contributing to external fragmentation, especially in workloads with variable-sized requests.

Cluster size choices in systems like the File Allocation Table (FAT) further contribute to internal fragmentation by enforcing allocation in fixed-size units that may exceed the actual needs of small files. For a file of size f bytes on a system with cluster size c bytes, the internal waste per file is c - (f \mod c), the unused portion of the last allocated cluster. Larger clusters, often used in FAT to support bigger volumes (e.g., 32 KB for drives over 1 GB), amplify this waste; for instance, a 1 KB file wastes nearly the entire 32 KB cluster, leading to significant storage inefficiency in directories with many small files. In inode-based designs such as the Unix file system, files are represented by inodes that contain direct pointers to data blocks and indirect pointers to blocks of pointers, enabling indexed allocation. When files grow, additional data blocks are allocated from available free space, which may be non-contiguous, leading to scattered block locations across the disk and external fragmentation, as the allocator does not guarantee physical locality.

These strategies persist because of inherent trade-offs between allocation speed and space efficiency. First-fit and next-fit prioritize rapid allocation by limiting search scope, making them suitable for performance-critical environments despite higher fragmentation risks, whereas best-fit offers better long-term space utilization at the cost of slower full-list scans. Similarly, larger cluster sizes in FAT enhance allocation throughput and reduce metadata overhead for large files but sacrifice efficiency for small ones, balancing overall system responsiveness against storage waste. The sketch below illustrates the first-fit and best-fit policies and the internal-waste formula.
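A minimal sketch of the two contiguous-allocation policies described above, operating on a free list of (start, length) extents; the helper names are hypothetical, and the model ignores real-world details such as alignment, reservation windows, and metadata updates.

```python
def first_fit(free_list, request):
    """Return (start, new_free_list) using the first hole that fits."""
    for i, (start, length) in enumerate(free_list):
        if length >= request:
            remainder = [(start + request, length - request)] if length > request else []
            return start, free_list[:i] + remainder + free_list[i + 1:]
    return None, free_list   # no hole large enough: external fragmentation in action

def best_fit(free_list, request):
    """Return (start, new_free_list) using the smallest hole that fits."""
    candidates = [(length, i) for i, (_, length) in enumerate(free_list) if length >= request]
    if not candidates:
        return None, free_list
    _, i = min(candidates)
    start, length = free_list[i]
    remainder = [(start + request, length - request)] if length > request else []
    return start, free_list[:i] + remainder + free_list[i + 1:]

# Free space: holes of 10, 4, and 20 blocks at offsets 0, 50, and 100.
holes = [(0, 10), (50, 4), (100, 20)]
print(first_fit(holes, 4))   # takes blocks 0-3, leaving a 6-block remnant at offset 4
print(best_fit(holes, 4))    # takes the 4-block hole exactly, leaving no remnant

# Internal waste for a file of f bytes with cluster size c: c - (f mod c).
c, f = 32 * 1024, 1 * 1024
print(c - f % c if f % c else 0)   # 31744 bytes wasted in the last (only) cluster
```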

Types

Internal Fragmentation

Internal fragmentation refers to the portion of allocated storage units, such as clusters or blocks, that remains unused because the file's data does not fully occupy the unit. For example, a 1 KB file stored in a 4 KB cluster wastes 3 KB of space within that cluster. This inefficiency arises from cluster-based allocation strategies that use fixed-size units to manage disk space efficiently. In file systems like FAT and NTFS, which employ fixed cluster sizes ranging from 512 bytes to 32 KB depending on volume capacity, internal fragmentation is particularly prevalent for small files, as each file requires at least one full cluster regardless of its actual size. Conversely, systems such as ReiserFS mitigate this issue through tail packing, a technique that stores the trailing portions of files, those smaller than a full block, in unused space within metadata or other partially filled blocks, thereby minimizing wasted space in dedicated data blocks.

The extent of internal fragmentation for a given file is determined by the difference between the cluster size and the remainder of the file size divided by the cluster size. Specifically, for a file of size f bytes in clusters of size c bytes, the waste in the last cluster is c - (f \mod c) if f \mod c \neq 0; otherwise, it is zero. The total internal fragmentation across all files is the sum of these individual wastes:

\text{Total internal fragmentation} = \sum_{\text{files}} \left( c - (f \mod c) \right) \quad \text{for } f \mod c \neq 0

This formula highlights how waste accumulates, especially with numerous small files. For instance, with a 4 KB cluster size and files of 1 KB, 2 KB, and 3 KB, the individual wastes are 3 KB, 2 KB, and 1 KB, respectively, yielding a total of 6 KB lost to internal fragmentation. A key characteristic of internal fragmentation is its cumulative impact on storage capacity over many files, resulting in substantial overall inefficiency without involving data relocation, as the waste is confined within already allocated units.
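The formula can be checked directly; the snippet below (a minimal sketch, with a hypothetical helper name) reproduces the 4 KB cluster example from this section.

```python
def internal_waste(file_size, cluster_size):
    """Unused bytes in the last allocated cluster: c - (f mod c), or 0 if f divides evenly."""
    remainder = file_size % cluster_size
    return cluster_size - remainder if remainder else 0

cluster = 4 * 1024
sizes = [1 * 1024, 2 * 1024, 3 * 1024]
wastes = [internal_waste(f, cluster) for f in sizes]
print([w // 1024 for w in wastes])   # [3, 2, 1] KB wasted per file
print(sum(wastes) // 1024)           # 6 KB total internal fragmentation
```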

External Fragmentation

External fragmentation in file systems refers to the phenomenon where files are stored in non-contiguous blocks across the disk and available free space becomes divided into small, scattered segments that cannot accommodate new allocations despite sufficient total free space. This scattering increases seek times and I/O operations, particularly on mechanical disks, as the disk head must move between distant locations to access the parts of a file. Distinct from internal fragmentation, which wastes space within allocated blocks due to block size mismatches, external fragmentation primarily affects the overall layout of data and free areas on the storage medium. Key subtypes include file fragmentation, where the blocks of a single file are split into multiple non-contiguous extents; free space fragmentation, where disjoint free regions create small "holes" too tiny for practical use; and file scattering, where blocks from different files become interleaved or widely separated, exacerbating access inefficiencies.

To quantify external fragmentation, common metrics assess the largest contiguous free region available or the extent count of individual files, indicating how fragmented the layout has become. For instance, a 100 MiB file divided into 200 separate extents, each roughly 512 KiB, can double read times on a disk with 100 MiB/s bandwidth and a 5 ms average seek time, due to the additional 199 seeks required. Among file system examples, older designs like FAT suffer severe external fragmentation because of their linked allocation approach, which chains scattered blocks without ensuring contiguity, leading to inefficient space use and performance degradation. In contrast, modern systems such as ext4 address this by employing extents, large contiguous ranges of blocks, to store file data, which naturally reduces fragmentation by preferring linear allocations and allowing online defragmentation to coalesce scattered extents into fewer, larger ones. The worked estimate below reproduces the read-time calculation.
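This is an idealized model, assuming one extra seek per additional extent, constant sequential bandwidth, and rotational latency folded into the average seek time.

```python
file_size_mib = 100
extents = 200
bandwidth_mib_s = 100      # sequential transfer rate
avg_seek_s = 0.005         # 5 ms per seek

transfer_time = file_size_mib / bandwidth_mib_s   # ~1.0 s to stream the data
seek_time = (extents - 1) * avg_seek_s            # 199 extra seeks ~ 0.995 s
print(transfer_time, seek_time, transfer_time + seek_time)   # read time roughly doubles
```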

Data Structure Fragmentation

Data structure fragmentation in file systems refers to the non-contiguous or inefficient allocation of metadata structures, such as directories, inodes, and journal logs, which disrupts sequential access and increases overhead during file system operations. Unlike file data fragmentation, this type targets auxiliary structures that store essential information like file attributes, permissions, and block mappings, leading to scattered metadata blocks that require multiple disk seeks for retrieval. This phenomenon arises from repeated file creations, deletions, and modifications that disrupt the initial localized layout of these structures. In the FAT file system, directory fragmentation manifests as entries becoming spread across non-contiguous clusters, since directories are treated as special files with chains managed by the file allocation table. As files are added or removed, the directory's cluster chain can break into fragments, requiring the system to traverse disjointed blocks to enumerate contents. Similarly, in the ext4 file system, inode fragmentation occurs when the block pointers within inode structures reference scattered data blocks, complicating metadata access despite efforts to localize inodes within block groups. This fragmentation hinders operations reliant on metadata locality, such as directory listings and file searches, which may involve numerous random disk accesses instead of sequential reads, thereby elevating latency. In log-structured file systems like LFS, write amplification exacerbates journal fragmentation, as sequential writes to the log are interspersed with garbage collection that scatters valid metadata updates across segments. These issues can compound challenges from external fragmentation by further dispersing related metadata away from file data.
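A simplified model of the FAT case described above: a directory's cluster chain is read out of the file allocation table, and the number of physically contiguous runs indicates how fragmented the directory's metadata has become. The table contents and the end-of-chain marker here are invented for illustration; real FAT variants use reserved cluster values instead.

```python
END_OF_CHAIN = -1   # stand-in for FAT's reserved end-of-chain marker

def cluster_chain(fat, start):
    """Follow a FAT-style chain of 'next cluster' links from the starting cluster."""
    chain, cluster = [], start
    while cluster != END_OF_CHAIN:
        chain.append(cluster)
        cluster = fat[cluster]
    return chain

def fragment_count(chain):
    """Number of physically contiguous runs in the chain."""
    return 1 + sum(1 for a, b in zip(chain, chain[1:]) if b != a + 1)

# Hypothetical FAT: a directory starts at cluster 5, continues at 6, then jumps to 42, 43.
fat = {5: 6, 6: 42, 42: 43, 43: END_OF_CHAIN}
chain = cluster_chain(fat, 5)
print(chain, fragment_count(chain))   # [5, 6, 42, 43] 2
```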

Impacts

Performance Degradation

File system fragmentation, particularly external fragmentation, leads to non-contiguous allocation of file blocks, which significantly increases disk access times on traditional hard disk drives (HDDs). For HDDs, reading a fragmented file requires multiple seeks by the read/write head to locate scattered blocks across the disk platter, incurring substantial seek times, typically 5-10 milliseconds per seek, plus rotational latency as the disk spins to position the desired sector under the head. This mechanical overhead can multiply the time needed to read a file; for instance, a file divided into numerous extents may demand dozens or hundreds of such operations, compared to a single contiguous read. In contrast, solid-state drives (SSDs) eliminate mechanical seek and rotational delays, resulting in much lower latency for random access, though fragmentation still imposes overhead through increased command queuing and disrupted internal parallelism.

This fragmentation-induced scattering amplifies I/O operations, as each non-contiguous extent of a file generates a separate read or write request from the host to the storage device. For example, a moderately fragmented file with a degree of fragmentation (DoF) of 256, meaning its blocks are split into 256 separate extents, can experience up to 4.4 times slower read performance on an NVMe SSD compared to a contiguous version, due to the proliferation of individual I/O commands that overwhelm the device's multi-queue processing. On HDDs, this I/O amplification exacerbates seek overhead, potentially requiring one seek per extent for a fragmented file and significantly prolonging overall operation times.

Historical benchmarks from the 1990s illustrate the scale of these slowdowns on HDD-based systems. In studies of aging Fast File System (FFS) partitions under real workloads, performance settled at 85-95% of peak levels after 2-3 years of use, with extreme cases like news spool directories showing reductions of up to 30% (to 70% of peak) in write throughput for 64 KB files due to fragmented block allocation. More recent evaluations confirm ongoing relevance, with fragmented workloads on modern SSDs exhibiting roughly 40% slowdowns in database query operations, where a 162 MB database split into over 10,000 pieces took substantially longer to process. In contemporary SSD environments, the absence of mechanical components mitigates much of the seek-related penalty, rendering fragmentation's impact far less severe than on HDDs, often limited to 2-5 times slowdowns in worst-case sequential reads rather than orders-of-magnitude delays. Persistent effects remain, however, including disrupted internal parallelism, where fragmented accesses cause die-level collisions that reduce effective throughput, and indirect strain on wear-leveling algorithms from scattered write patterns that accelerate uneven flash cell erasure. A simple cost model contrasting the two device types follows.
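To see why the same degree of fragmentation hurts an HDD far more than an SSD, a crude cost model helps: on the HDD each extra extent costs a mechanical seek plus rotational latency, whereas on the SSD it costs only per-command overhead. The constants below are round illustrative figures, not measurements, and the simple SSD term deliberately understates the measured penalties, which arise mainly from the lost internal parallelism described above.

```python
def read_time_s(size_mib, extents, bandwidth_mib_s, per_extent_overhead_s):
    """Transfer time plus a fixed per-extent access penalty."""
    return size_mib / bandwidth_mib_s + extents * per_extent_overhead_s

size, extents = 256, 256   # e.g., a file with a degree of fragmentation of 256

# HDD: ~8 ms seek plus rotational latency per extent, ~150 MiB/s sequential.
hdd_frag = read_time_s(size, extents, 150, 0.008)
hdd_contig = read_time_s(size, 1, 150, 0.008)

# SSD: ~50 us per-command overhead, ~2000 MiB/s sequential.
ssd_frag = read_time_s(size, extents, 2000, 0.00005)
ssd_contig = read_time_s(size, 1, 2000, 0.00005)

print(f"HDD slowdown: {hdd_frag / hdd_contig:.1f}x")    # roughly 2.2x with these numbers
print(f"SSD slowdown: {ssd_frag / ssd_contig:.2f}x")    # roughly 1.10x with these numbers
```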

Storage Inefficiency

File system fragmentation results in storage inefficiency primarily through the creation of unusable space fragments that diminish the effective capacity available for data. External fragmentation scatters free space into numerous small holes, many of which are too small to accommodate new files or meet minimum allocation unit sizes, leaving them effectively wasted. Internal fragmentation adds to this by generating slack space within allocated clusters that remain partially unused, particularly for files smaller than the cluster size. Together, these mechanisms reduce the overall usable capacity, with studies indicating that fragmented file systems can retain small but notable portions of unused space, such as approximately 2% under simulated heavy usage.

The compounding effects of internal and external fragmentation further exacerbate storage waste, as both types contribute independently to lost capacity. Internal waste arises from the difference between allocated block sizes and actual file needs, while external waste stems from fragmented free areas that cannot be coalesced. The total usable space can thus be conceptualized as the disk's total capacity minus the sum of internal slack and external unusable holes; for instance, in a system with 4 KB clusters storing many small files, average slack per file approaches half the cluster size (2 KB), while external holes can occupy an additional portion of free space, leading to cumulative reductions in available storage. This additive nature means that addressing one type alone does not fully mitigate the inefficiency.

Over time, repeated file operations, such as creations, deletions, and modifications, cause fragmentation to accumulate, progressively eroding available contiguous space and intensifying storage inefficiency. As operations continue, free space becomes increasingly divided into smaller segments, making it harder to allocate blocks for larger files and forcing more reliance on scattered locations, which indirectly wastes capacity through unallocatable remnants. Simulations of prolonged heavy usage demonstrate this buildup, with fragmentation levels rising steadily and leading to sustained reductions in efficient utilization.

The degree of storage inefficiency varies by file system design, with linear allocation schemes like FAT exhibiting higher vulnerability than tree-based systems like NTFS. FAT's sequential chaining approach readily results in scattered allocations and fragmented free space after many operations, amplifying waste from both internal slack and external holes. NTFS, by contrast, employs structures like the Master File Table and extent-based allocation to optimize placement and minimize scattering, thereby preserving greater usable capacity even under load. This design difference makes FAT more inefficient in long-term scenarios with diverse file activities. A back-of-the-envelope calculation of the combined waste follows.
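The additive nature of the two waste sources can be made concrete with a short calculation; all figures below are illustrative assumptions, not measurements.

```python
GiB = 1024 ** 3
KiB = 1024

total_capacity = 100 * GiB
n_small_files = 2_000_000
avg_internal_slack = 2 * KiB     # ~half of a 4 KiB cluster wasted per small file
external_holes = 2 * GiB         # scattered free fragments too small to reuse

internal_waste = n_small_files * avg_internal_slack
usable = total_capacity - internal_waste - external_holes
print(f"internal waste: {internal_waste / GiB:.1f} GiB")                       # ~3.8 GiB
print(f"effectively usable: {usable / GiB:.1f} GiB of {total_capacity / GiB:.0f} GiB")
```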

Mitigation Strategies

Prevention Techniques

Prevention techniques for file system fragmentation focus on proactive measures in allocation algorithms, file system architectures, operational habits, and tunable parameters to maintain contiguous storage and minimize waste. Allocation policies such as the buddy system divide free space into power-of-two sized blocks, enabling efficient coalescing of adjacent free regions to preserve larger contiguous areas and reduce external fragmentation. This approach, originally developed for memory management but adapted for disk allocation, limits fragmentation by restricting allocations to buddy-compatible sizes, ensuring that deallocated blocks can merge seamlessly without leaving small, unusable gaps. Similarly, best-fit allocation strategies select the smallest suitable free block for a request, which counters poorly fitting allocations that scatter files and promotes denser packing to avoid creating isolated fragments.

Certain file system designs incorporate mechanisms to allocate space in advance or handle modifications without immediate disruption. In NTFS, pre-allocation via APIs like SetEndOfFile reserves contiguous disk space for a file before data is written, preventing fragmentation during growth by ensuring that extensions occur in adjacent clusters rather than scattered locations (a Linux analogue is sketched below). NTFS also protects the Master File Table (MFT) by reserving 12.5% of the volume exclusively for it, managing this reservation dynamically to avoid MFT fragmentation as the file system fills. Copy-on-write (CoW) mechanisms in file systems like Btrfs and ZFS write modifications to new locations instead of overwriting in place, which avoids splitting existing contiguous extents during updates but requires careful management to limit cumulative fragmentation from repeated redirects.

Usage best practices emphasize patterns that limit disruptive operations. Avoiding frequent creation and deletion of small files reduces the accumulation of metadata overhead and scattered allocations, as each operation can introduce tiny fragments that hinder contiguous space availability. On SSDs, mounting with the noatime option disables access-time updates, minimizing unnecessary metadata writes that could lead to fragmentation by reducing the frequency of small, random I/O patterns.

Configuration options allow tuning to balance fragmentation types. Increasing the cluster size in NTFS from the default 4 KB to 64 KB for volumes dominated by large files decreases the number of clusters needed per file, lowering external fragmentation at the cost of potential internal waste for smaller files, while reducing overall allocation overhead. Enabling extent-based allocation in file systems like ext4 groups contiguous blocks into single extents, which minimizes both internal and external fragmentation by representing large files with fewer metadata entries and promoting sequential placement.
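As a concrete example of the pre-allocation idea, the sketch below uses the Linux/POSIX analogue of the NTFS approach described above (os.posix_fallocate, available on Linux) rather than SetEndOfFile itself: reserving a file's eventual size before any data is written encourages the file system to hand out a single contiguous extent. The file path and size are hypothetical.

```python
import os

def preallocate(path, size_bytes):
    """Reserve space for a file up front so later writes extend into
    already-allocated, ideally contiguous, blocks (Linux/POSIX only)."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
    try:
        # Ask the file system to allocate all blocks now, rather than
        # piecemeal as the file grows through repeated appends.
        os.posix_fallocate(fd, 0, size_bytes)
    finally:
        os.close(fd)

preallocate("large_output.bin", 1 * 1024 ** 3)   # reserve 1 GiB before writing
```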

Defragmentation Processes

Defragmentation processes involve reorganizing the physical layout of files on storage devices to consolidate fragmented data into contiguous blocks, thereby restoring performance degraded by scattering. These methods typically scan the disk to identify fragmentation, then relocate extents or clusters while managing free space to minimize future issues. Traditional approaches focus on hard disk drives (HDDs), where seek times benefit from contiguity, but adaptations exist for solid-state drives (SSDs) that emphasize optimization over relocation.

Offline defragmentation requires exclusive access to the volume, often necessitating unmounting or booting into a dedicated environment to perform a comprehensive scan and relocation without interference from active processes. In Windows, the built-in defragmenter, invoked via the defrag command or the Optimize Drives utility, begins by analyzing the volume to identify free clusters and retrieval pointers for fragmented files. It then uses control codes like FSCTL_MOVE_FILE to relocate clusters to contiguous areas, updating its analysis iteratively to account for concurrent changes, though this process can take hours on large drives and requires administrative privileges. This method achieves near-complete consolidation but incurs significant downtime, making it suitable for scheduled maintenance on non-critical systems.

Online defragmentation operates in the background on mounted and active file systems, allowing continuous user access while gradually relocating data to reduce disruption. For ext4 file systems, the e4defrag utility performs this by copying the extents of fragmented files to new contiguous blocks, targeting regular files and directories up to a maximum extent size of 131072 KB (with 4 KB blocks), without unmounting the device. Similarly, macOS with HFS+ has incorporated automated background defragmentation since OS X 10.2, proactively reorganizing file data, metadata, and free space during idle periods to maintain efficiency without manual intervention; APFS provides more limited automatic defragmentation for small files under 20 MB with more than 8 fragments. For XFS, the xfs_fsr tool supports online defragmentation by relocating files to contiguous space while the file system is mounted. Btrfs includes a btrfs filesystem defragment command for online reorganization of files, with options to target specific paths or an entire subvolume. F2FS offers defragmentation via fstrim integration and its gc_urgent mode, which accelerates garbage collection to reduce fragmentation during idle times. These processes may temporarily slow system responsiveness due to I/O contention but avoid full outages.

Core algorithms in defragmentation tools emphasize consolidation through targeted relocation, often ordering files by access frequency or size to prioritize high-impact items and pack them densely. For instance, tools like Microsoft's defragmenter or third-party solutions iterate over retrieval pointers to locate and move clusters, while modern approaches such as FragPicker analyze application I/O patterns to selectively defragment only performance-critical data, reducing overall defragmentation I/O by up to 66% compared to exhaustive scans. Consolidation steps typically involve identifying free space gaps, shifting fragments sequentially, and compacting remaining voids to prevent re-fragmentation. Specialized tools extend these capabilities, such as PerfectDisk, which employs SMARTPlacement optimization to rearrange files based on usage patterns during a single-pass defragmentation, supporting both offline and online modes on Windows volumes, including RAID arrays.
For ext2, ext3, and ext4, the e2fsck file system checker includes a -D option to optimize directories by reindexing, sorting entries, or compressing them, which indirectly defragments metadata structures such as hash-tree directories for faster lookups. Challenges persist, however: open files with active locks cannot always be relocated without closing their handles, leading to partial consolidation; and on SSDs, traditional relocation is discouraged because of the additional write wear it causes, so tools instead integrate TRIM commands to mark unused blocks for garbage collection, preserving endurance without physical data movement.
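The core scan-and-relocate loop common to these tools can be sketched in miniature: find a file whose blocks span more than one extent, look for a free contiguous region large enough to hold it whole, and copy the blocks there. This is a toy model only, operating on a raw block map and ignoring the case where source and destination ranges overlap; real defragmenters work through file system interfaces such as FSCTL_MOVE_FILE on Windows or the ext4 move-extent ioctl used by e4defrag.

```python
def free_runs(disk):
    """Contiguous runs of free blocks as (start, length); None marks a free block."""
    runs, start = [], None
    for i, owner in enumerate(disk + ["sentinel"]):
        if owner is None and start is None:
            start = i
        elif owner is not None and start is not None:
            runs.append((start, i - start))
            start = None
    return runs

def defragment_file(disk, file_blocks):
    """Relocate a fragmented file into the first free run that can hold it whole."""
    name = disk[file_blocks[0]]
    for start, length in free_runs(disk):
        if length >= len(file_blocks):
            for offset, old in enumerate(file_blocks):
                disk[old] = None               # release the scattered block
                disk[start + offset] = name    # "copy" it into the contiguous run
            return list(range(start, start + len(file_blocks)))
    return file_blocks                         # nothing big enough; leave as-is

# Disk layout: file "A" scattered across blocks 0, 5, and 9; the rest free.
disk = [None] * 12
blocks_of_a = [0, 5, 9]
for b in blocks_of_a:
    disk[b] = "A"

blocks_of_a = defragment_file(disk, blocks_of_a)
print(blocks_of_a)   # A now occupies a contiguous run: [1, 2, 3]
print(disk)
```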
