File system
A file system, often abbreviated as FS, is a fundamental component of an operating system responsible for organizing, storing, retrieving, and managing data on storage devices such as hard disk drives, solid-state drives, or optical media.[1] It provides a structured abstraction layer that allows users and applications to interact with files and directories without directly handling the physical storage details, including allocation of space and maintenance of metadata such as file names, sizes, permissions, and timestamps.[2] File systems typically employ a hierarchical directory structure that mirrors familiar folder organization, enabling efficient navigation and access control through mechanisms such as user permissions and access control lists (ACLs).[3]

At their core, file systems consist of layered architectures: the physical file system layer handles low-level interactions with hardware, such as block allocation on disks; the logical file system manages metadata and file operations like creation, deletion, and searching; and the virtual file system (VFS) acts as an interface that supports multiple file system types seamlessly within the same OS.[4] Key operations include reading, writing, opening, and closing files, often supported by APIs that ensure atomicity and consistency, particularly in multi-user environments.[2]

The evolution of file systems dates back to the early days of computing: early systems in the 1950s and 1960s relied on sequential tape storage, hierarchical structures were first introduced in Multics in the late 1960s and refined in Unix during the 1970s, and journaling and copy-on-write mechanisms emerged in the 1990s and beyond to enhance reliability and performance.[5][6]

Notable types include FAT (File Allocation Table), an early, simple system valued for cross-platform compatibility but limited by file size constraints; NTFS (New Technology File System), introduced with Windows NT in 1993 and now the default for Windows, offering features like encryption, compression, and crash recovery; exFAT, designed for flash drives and supporting large files; ext4, a robust journaling system for Linux; and APFS for Apple devices, optimized for SSDs with built-in encryption and snapshots.[7][8] These variations address specific needs, such as scalability for enterprise storage or efficiency for mobile devices, while common challenges include fragmentation, security vulnerabilities, and adapting to emerging hardware like NVMe drives.[9]

Fundamentals
Definition and Purpose
A file system is an abstraction layer in an operating system that organizes, stores, and retrieves data on persistent storage media such as hard drives or solid-state drives, treating files as named, logical collections of related data bytes.[10][11] This abstraction hides the complexities of physical storage, such as disk sectors and blocks, from applications and users, presenting data instead as structured entities that can be easily accessed and manipulated.[12] File systems are typically agnostic to the specific contents of files, allowing them to handle diverse data types without interpreting the information itself.[11]

The primary purpose of a file system is to enable reliable, long-term persistence of data beyond program execution or system restarts, while supporting efficient organization and access for both users and applications.[13] It facilitates hierarchical structuring of files through directories, tracks essential metadata such as file size, creation timestamps, ownership, and permissions, and manages space allocation to prevent data corruption or loss.[14] By providing these features, file systems bridge low-level hardware operations—like reading or writing fixed-size blocks on a disk—with high-level software needs, such as sequential or random access to variable-length streams.[15]

Key concepts in file systems distinguish between files, which serve as containers for raw data, and directories, which act as organizational units grouping files and subdirectories into navigable structures.[16] Metadata, stored separately from the file contents, includes attributes like identifiers, locations on storage, protection controls, and usage timestamps, enabling secure and trackable operations.[14] For instance, file systems abstract the linear arrangement of disk sectors into logical views, such as tree-like hierarchies for directories or linear streams for file contents, simplifying data management across diverse hardware.[17][12]
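The separation between a file's metadata and its contents is visible through the POSIX stat() interface, which returns attributes such as size, permissions, and timestamps without reading any file data. A minimal C sketch follows; the path /etc/hostname is only an illustrative example.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

int main(void)
{
    struct stat sb;

    /* Query metadata only; the file's contents are never read. */
    if (stat("/etc/hostname", &sb) == -1) {
        perror("stat");
        return 1;
    }

    printf("inode number : %llu\n", (unsigned long long) sb.st_ino);
    printf("size (bytes) : %lld\n", (long long) sb.st_size);
    printf("permissions  : %o\n", (unsigned) (sb.st_mode & 0777));
    printf("last modified: %s", ctime(&sb.st_mtime));
    return 0;
}
```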
Historical Development
The development of file systems began in the 1950s with early computing systems relying on punch cards and magnetic tapes for data storage. Punch cards served as a sequential medium for input and storage in machines like the IBM 701, introduced in 1952, but magnetic tape emerged as a key advancement. The IBM 726 tape drive, paired with the 701 in 1953, provided the first commercial magnetic tape storage for computers, capable of holding 2 million digits on a single reel at speeds of 70 inches per second. These systems treated files as sequential records without hierarchical organization, limiting access to linear reads and writes.[18][19]

By the 1960s, the shift to disk-based storage marked a significant evolution, enabling random access and more efficient file management. IBM's OS/360, released in 1966 for the System/360 mainframe family, introduced direct access storage devices (DASD) like the IBM 2311 disk drive from 1964, which supported removable disk packs with capacities up to 7.25 MB. This allowed for the first widespread use of disk file systems in batch processing environments, organizing data into datasets accessible via indexed sequential methods, though still largely flat in structure.[20][21]

The 1970s and 1980s brought innovations in hierarchical organization and user interfaces. The Unix file system, developed at Bell Labs in the early 1970s and first released in 1971, popularized a tree-like directory structure with nested subdirectories, inspired by Multics, enabling efficient file organization and permissions.[22] The File Allocation Table (FAT), created by Microsoft in 1977 for Standalone Disk BASIC and adopted in MS-DOS by 1981, provided a simple table-driven linked allocation scheme for floppy and hard disks, supporting basic directory hierarchies but limited by 8.3 filename constraints. Meanwhile, the Xerox Alto, unveiled in 1973, introduced graphical user interface (GUI) elements for file management through its Neptune file browser, allowing icon-based manipulation on a bitmapped display and influencing future personal computing designs.[23][24]

In the 1990s and 2000s, file systems emphasized reliability through journaling and advanced features. Microsoft's NTFS, launched in 1993 with Windows NT 3.1, incorporated journaling to log metadata changes for crash recovery, alongside support for large volumes, encryption, and access control lists.[25] Linux's ext2, introduced in 1993 by Rémy Card and others, offered a robust inode-based structure succeeding the original ext, while ext3 in 2001 added journaling for faster recovery. Sun Microsystems' ZFS, announced in 2005, advanced data integrity with end-to-end checksums, copy-on-write mechanisms, and built-in volume management to detect and repair silent corruption.[26][27]

The 2010s and 2020s saw adaptations for modern hardware, mobile devices, and distributed environments. Apple's APFS, released in 2017 with macOS High Sierra, is optimized for SSDs with features like snapshots, cloning, and space sharing across volumes for enhanced performance on iOS and macOS devices. Btrfs, initiated by Chris Mason in 2007 and merged into the Linux kernel in 2009, introduced copy-on-write for snapshots and subvolumes, improving scalability and data integrity in Linux distributions. Distributed systems gained prominence with Ceph, originating from a 2006 OSDI paper and first released in 2007, providing scalable object storage with dynamic metadata distribution for cluster environments.
Amazon S3, launched in 2006 as an object store, evolved in the 2020s with file system abstractions like S3 File Gateway and integrations for POSIX-like access, enabling cloud-native scalability for massive datasets in AI and big data applications.[28][29][30]

Key innovations across this history include the transition from flat, sequential structures to hierarchical directories for better organization; the adoption of journaling in systems like NTFS, ext3, and ZFS to ensure crash recovery without full scans; and the integration of distributed and cloud paradigms in Ceph and S3 abstractions, addressing scalability for virtualization and AI workloads post-2020.[22][27][30]

Architecture
Core Components
The architecture of many file systems, particularly block-based ones inspired by the Unix model such as ext4, includes core components that form the foundational structure for organizing and managing data on storage media. Variations exist in other file systems, such as NTFS or FAT, which use different structures like the Master File Table or File Allocation Table (detailed in the Types section). The superblock serves as the primary global metadata structure, containing essential parameters such as the total number of data blocks, block size, and file system state, which enable the operating system to interpret and access the file system layout.[31] In Unix-like systems, the superblock is typically located at a fixed offset on the device and includes counts of free blocks and inodes to facilitate space management.[32]

The inode table consists of per-file metadata entries, each inode holding pointers to data blocks along with attributes like file size and ownership, allowing efficient mapping of logical file contents to physical storage locations.[31] Data blocks, in contrast, store the actual content of files, allocated in fixed-size units to balance performance and overhead on the underlying hardware.[31]

These components interact through layered abstractions: device drivers provide low-level hardware access by handling I/O operations on physical devices like disks, while the file system driver translates logical block addresses to physical ones, ensuring data integrity during reads and writes.[33] In operating systems like Unix and Linux, the Virtual File System (VFS) layer acts as an abstraction interface, standardizing access to diverse file systems by intercepting system calls and routing them to the appropriate file system driver, thus enabling seamless integration of multiple file system types within a unified namespace.[34]

Key processes underpin these interactions: mounting attaches the file system to the OS namespace by reading the superblock, validating the structure, and establishing the root directory in the global hierarchy, making its contents accessible to processes.[35] Unmounting reverses this by flushing pending writes, releasing resources, and detaching the file system to prevent data corruption during device removal or shutdown.[36] Formatting initializes the storage media by writing the superblock, allocating the inode table, and setting up initial data structures, preparing the device for use without existing data.[31]

Supporting data structures include block allocation tables, often implemented as bitmaps to track free and allocated space across data blocks, enabling quick identification of available storage during file creation or extension.[32] Directory entries link human-readable file names to inode numbers, forming the basis for path resolution and navigation within the file system hierarchy.[37] Together, these elements ensure reliable data organization and access, with the superblock providing oversight, inodes and data blocks handling individual files, and abstraction layers bridging hardware and software.
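Much of the superblock-level information described above (block size, total and free block counts, free inodes) is exposed to applications through the POSIX statvfs() call, which the VFS answers on behalf of whichever concrete file system is mounted at the given path. A minimal sketch, using "/" as an illustrative mount point:

```c
#include <stdio.h>
#include <sys/statvfs.h>

int main(void)
{
    struct statvfs vfs;

    /* The VFS routes this query to the file system mounted at "/". */
    if (statvfs("/", &vfs) == -1) {
        perror("statvfs");
        return 1;
    }

    printf("block size        : %lu bytes\n", vfs.f_frsize);
    printf("total data blocks : %llu\n", (unsigned long long) vfs.f_blocks);
    printf("free data blocks  : %llu\n", (unsigned long long) vfs.f_bfree);
    printf("free inodes       : %llu\n", (unsigned long long) vfs.f_ffree);
    return 0;
}
```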
Metadata and File Attributes
In file systems, metadata refers to data that describes the properties and characteristics of files, distinct from the actual file content. This information enables the operating system to manage, access, and protect files efficiently. Metadata storage varies by file system type; for example, Unix-like systems store it separately from the file's data blocks in dedicated structures like inodes, while others like NTFS integrate it into file records within a central table.[38][39][40]

Core file attributes form the foundational metadata and include essential details for file identification and operation. These encompass the file name (though often handled via directory entries), size in bytes, timestamps for creation (birth time, where supported), last modification (mtime), and last access (atime), as well as file type indicators such as regular files, directories, symbolic links, or special files like devices. Permissions are also core, specifying read, write, and execute access for the owner, group, and others, encoded in a mode field.[41][42][40]

Extended attributes provide additional, flexible metadata beyond core properties, allowing for user-defined or system-specific information. Common examples include ownership details via user ID (UID) and group ID (GID), MIME types for content identification, and custom tags such as access control lists (ACLs) in modern systems like Linux. These are stored as name-value pairs and can be manipulated via system calls like setxattr.[43][42]

Metadata storage often relies on fixed-size structures to ensure consistent access times and minimize fragmentation. In Unix-derived file systems, inodes serve as these structures, containing pointers to data blocks alongside attributes; for instance, the ext4 file system uses 256-byte inode records by default, with extra space allocated for extended attributes (up to 32 bytes for i_extra_isize as of Linux kernel 5.2). This design incurs overhead, as each file requires its own inode, potentially consuming significant space in directories with many small files—e.g., ext4's default allocates one inode per 16 KiB of filesystem space.[42][38]
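On Linux, extended attributes are manipulated with calls such as setxattr() and getxattr(); user-defined attributes live in the user. namespace. The sketch below attaches a hypothetical user.mime_type attribute to an illustrative file, assuming the underlying file system (for example ext4 or XFS) supports extended attributes.

```c
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/xattr.h>

int main(void)
{
    const char *path = "example.dat";                 /* illustrative file name */
    const char *value = "application/octet-stream";
    char buf[128];

    /* Ensure the illustrative file exists before attaching an attribute. */
    close(open(path, O_CREAT | O_WRONLY, 0644));

    /* Store a user-defined name/value pair alongside the file's metadata. */
    if (setxattr(path, "user.mime_type", value, strlen(value), 0) == -1) {
        perror("setxattr");
        return 1;
    }

    /* Read it back; getxattr returns the number of bytes copied into buf. */
    ssize_t len = getxattr(path, "user.mime_type", buf, sizeof(buf) - 1);
    if (len == -1) {
        perror("getxattr");
        return 1;
    }
    buf[len] = '\0';
    printf("user.mime_type = %s\n", buf);
    return 0;
}
```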
Organization and Storage
Directories and Hierarchies
In file systems, directories function as special files that serve as containers for organizing other files and subdirectories. Each directory maintains a list of entries, typically consisting of pairs that associate a file or subdirectory name with its corresponding inode—a data structure holding metadata such as permissions, timestamps, and pointers to data blocks. This design allows directories to act as navigational aids, enabling efficient lookup and access without storing the actual file contents. The root directory, often denoted by a forward slash (/), marks the apex of the hierarchy and contains initial subdirectories like those for system binaries or user home folders in Unix-like systems.[38][44]

The hierarchical model structures directories and files into an inverted tree, where the root directory branches into parent-child relationships, with each subdirectory potentially spawning further levels. This organization promotes logical grouping, such as separating user data from system files, and supports scalability for managing vast numbers of items. Navigation within this tree relies on paths: absolute paths specify locations from the root (e.g., /home/user/documents), providing unambiguous references, while relative paths describe positions from the current working directory (e.g., ../docs), reducing redundancy in commands and scripts. This model originated in early Unix designs and remains foundational in modern operating systems for its balance of simplicity and extensibility.[45]

Key operations on directories include creation via the mkdir system call, which allocates a new inode and initializes an empty entry list with specified permissions; deletion through rmdir, which removes an empty directory by freeing its inode only if no entries remain; and renaming with rename, which updates the name in the parent directory's entry table while preserving the inode. Traversal operations, essential for searching or listing contents, often employ depth-first search (DFS) to explore branches recursively—as in the find utility—or breadth-first search (BFS) for level-by-level scanning, as seen in tree-like listings from ls -R, optimizing for memory use in deep versus wide structures. These operations ensure atomicity where possible, preventing partial states during concurrent access.[46][47]

Variations in hierarchy depth range from flat structures, where all files reside in a single directory without nesting, to deep hierarchies with multiple levels for fine-grained organization; flat models suit resource-constrained environments like embedded systems by minimizing overhead, but hierarchical ones excel in large-scale storage by easing management and reducing name collisions. To accommodate non-tree references, hard links create additional directory entries pointing to the same inode, allowing multiple paths to one file within the same file system, while symbolic links store a path string to another file or directory, enabling cross-file-system references but risking dangling links if the target moves. These mechanisms enhance flexibility without altering the core tree topology.[48][49]
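The depth-first traversal mentioned above can be written directly against the POSIX opendir()/readdir() interface. The following sketch walks a directory tree recursively, skipping the special "." and ".." entries so the recursion does not loop, and using lstat() so symbolic links are not followed; the starting path is taken from the command line.

```c
#include <stdio.h>
#include <string.h>
#include <dirent.h>
#include <sys/stat.h>

/* Depth-first traversal: print each entry, then recurse into subdirectories. */
static void walk(const char *path)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return;

    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        /* Skip the special "." and ".." entries to avoid infinite recursion. */
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0)
            continue;

        char child[4096];
        snprintf(child, sizeof(child), "%s/%s", path, entry->d_name);
        printf("%s\n", child);

        struct stat sb;
        /* lstat() so symbolic links are not followed into other subtrees. */
        if (lstat(child, &sb) == 0 && S_ISDIR(sb.st_mode))
            walk(child);
    }
    closedir(dir);
}

int main(int argc, char *argv[])
{
    walk(argc > 1 ? argv[1] : ".");
    return 0;
}
```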
File Names and Paths
File names in file systems follow specific conventions to ensure uniqueness and proper navigation within the directory hierarchy. In POSIX-compliant systems, such as Unix-like operating systems, a file name is a sequence of characters that identifies a file or directory, excluding the forward slash (/) which serves as the path separator, and the null character (NUL, ASCII 0), which is not permitted.[50] Filenames may include alphanumeric characters (A-Z, a-z, 0-9), punctuation, spaces, and other printable characters, with a maximum length of {NAME_MAX} bytes, which is at least 14 but commonly 255 in modern implementations like ext4.[51] For portability across POSIX systems, filenames should ideally use only the portable character set: A-Z, a-z, 0-9, period (.), underscore (_), and hyphen (-).[50] In contrast, Windows file systems, such as NTFS, allow characters from the current code page (typically ANSI or UTF-16), but prohibit the following reserved characters: backslash (\), forward slash (/), colon (:), asterisk (*), question mark (?), double quote ("), less than (<), greater than (>), and vertical bar (|).[52] Additionally, Windows reserves certain names like CON, PRN, AUX, NUL, COM0 through COM9, and LPT0 through LPT9, which cannot be used for files or directories regardless of extension, due to their association with legacy device names.[52]

Case sensitivity varies significantly across file systems, impacting how names are interpreted and stored. POSIX file systems, including ext2/ext3/ext4 on Linux, are case-sensitive, meaning "file.txt" and "File.txt" are treated as distinct files.[53] This allows for greater namespace density but requires careful attention to capitalization. Windows NTFS is case-preserving but case-insensitive by default, storing the original case while treating "file.txt" and "File.txt" as identical during lookups, though applications can enable case-sensitive behavior via configuration.[52] Early file systems like FAT, used in MS-DOS and early Windows, enforced an 8.3 naming convention: up to 8 characters for the base name (uppercase only, alphanumeric plus some symbols) followed by a period and up to 3 characters for the extension, with no support for long names or lowercase preservation initially.[54]

Paths construct hierarchical references to files by combining directory names and separators. In Unix-like systems, absolute paths begin from the root directory with a leading slash (/), as in "/home/user/document.txt", providing a complete location independent of the current working directory. Relative paths omit the leading slash and are resolved from the current directory, using "." to denote the current directory and ".." to reference the parent directory; for example, "../docs/report.pdf" navigates up one level then into a subdirectory. The maximum path length in POSIX is {PATH_MAX} bytes, at least 256 but often 4096 in Linux implementations, including the null terminator.[51] Windows paths use a drive letter followed by a colon and backslash (e.g., "C:\Users\user\file.txt" for absolute paths), with relative paths similar to Unix but using backslashes as separators; the default maximum path length is 260 characters (MAX_PATH), though newer versions support up to 32,767 via extended syntax.[52]

Portability issues arise from these differences, complicating data exchange across systems.
For instance, the 8.3 format in FAT limits names to short, uppercase forms, truncating or aliasing longer names, which can lead to collisions when transferring files to modern systems.[54] Unicode support enhances internationalization; ext4 in Linux stores filenames as UTF-8 encoded strings, allowing non-ASCII characters like accented letters or scripts such as Chinese, provided the locale supports UTF-8.[55] Windows NTFS uses UTF-16 for long filenames, but FAT variants are limited to ASCII, restricting portability for international content.[52] Case insensitivity in Windows can cause overwrites or errors on case-sensitive systems, while reserved names like "CON" may prevent file creation on Windows even if valid elsewhere.[52]

Special names facilitate navigation without explicit path construction. In POSIX systems, every directory contains two implicit entries: a single dot (.) representing the directory itself, and double dots (..) referring to its parent directory, enabling relative traversal without knowing absolute locations.[56] These are not ordinary files but standardized directory entries present in every directory; in the root directory, ".." refers to the root itself. Filenames starting with a single dot (e.g., ".hidden") are conventionally treated as hidden, often omitted from default listings unless explicitly requested.[56]
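Because limits such as {NAME_MAX} and {PATH_MAX} differ between file systems, POSIX provides pathconf() to query the values that actually apply at a particular location. A short sketch, defaulting to the current directory:

```c
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    const char *path = argc > 1 ? argv[1] : ".";

    /* Limits are per file system, so they are queried for a specific path.
     * A return value of -1 with errno unchanged means "no fixed limit". */
    long name_max = pathconf(path, _PC_NAME_MAX);
    long path_max = pathconf(path, _PC_PATH_MAX);

    printf("NAME_MAX for %s: %ld\n", path, name_max);   /* commonly 255 */
    printf("PATH_MAX for %s: %ld\n", path, path_max);   /* often 4096 on Linux */
    return 0;
}
```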
Storage Allocation and Space Management
File systems allocate storage space to files using methods that determine how disk blocks are assigned, each with trade-offs in performance, space efficiency, and complexity. Contiguous allocation stores an entire file in consecutive disk blocks, enabling efficient sequential reads and writes since only the starting block address needs to be recorded; however, it requires knowing the file size in advance, leads to external fragmentation as free space becomes scattered, and makes file extension difficult without relocation.[57] This method was common in early systems but is less prevalent today due to its inflexibility.[57]

Linked allocation, in contrast, organizes file blocks as a linked list where each block contains a pointer to the next, allowing files to grow dynamically without pre-specifying size and avoiding external fragmentation entirely.[57] The directory entry stores only the first block's address, and the last block points to null; this approach, used in the File Allocation Table (FAT) system, supports easy insertion and deletion but imposes overhead for random access, as traversing the chain requires reading multiple blocks, and a lost pointer can render the rest of the file inaccessible.[57]

Indexed allocation addresses these limitations by using a dedicated index block or structure—such as the inode in Unix-like file systems—that holds pointers to all data blocks, facilitating both sequential and random access with O(1) lookup after the initial index fetch.[57] For large files, indirect indexing extends this by pointing to additional index blocks, supporting files far beyond direct pointer limits; this method, employed in systems like ext4, incurs metadata overhead but provides flexibility for varying file sizes and reduces access latency compared to linked schemes.[57][58]

Free space is tracked using structures like bitmaps or linked lists to identify available blocks efficiently. Bit vector (bitmap) management allocates one bit per disk block (0 for free, 1 for allocated), enabling quick scans for free or contiguous blocks, though the bitmap itself consumes one bit per block; for a 1 TB disk with 4 KB blocks, this equates to about 32 MB.[57] Linked free lists chain unused blocks via pointers within each block, minimizing auxiliary space on mostly full disks but requiring linear-time searches for free blocks, which can degrade performance on large volumes.[57] Block size selection, often 4 KB as the default in ext4, balances these: smaller blocks (e.g., 1 KB) reduce internal fragmentation for tiny files by wasting less partial space, while larger blocks (e.g., 64 KB) lower per-block metadata costs and boost I/O throughput for sequential operations on big files, though they increase slack space in undersized files.[58]

Advanced techniques enhance allocation efficiency for specific workloads.
Pre-allocation reserves contiguous blocks for anticipated large files via system calls like fallocate in POSIX-compliant systems, marking space as uninitialized without writing data to speed up future writes and mitigate fragmentation; this is supported in file systems such as ext4, XFS, and Btrfs, where it allocates blocks instantly rather than incrementally.[59] Sparse files further optimize by logically representing large zero-filled regions ("holes") without physical allocation, storing only metadata for these gaps and actual data blocks for non-zero content; when read, holes return zeros transparently, conserving space for sparse datasets like databases or virtual machine images, as implemented in NTFS and ext4.[60][59]

Overall space management incurs overhead from metadata and reservations, limiting usable capacity. Usable space can be calculated as total capacity minus (metadata structures size plus reserved blocks); in ext4, for instance, 5% of blocks are reserved by default for the root user to prevent fragmentation during emergencies, contributing to typical overhead of 5-10% alongside inode and journal metadata.[58]
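Both pre-allocation and sparse files are reachable from ordinary POSIX calls. The sketch below reserves space up front for one file with posix_fallocate() and creates a hole in another by seeking past the end before writing, so that only the final block is physically allocated; the file names and sizes are illustrative.

```c
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    /* Pre-allocation: reserve 1 MiB up front so later writes stay contiguous. */
    int fd1 = open("preallocated.dat", O_CREAT | O_WRONLY, 0644);
    if (fd1 == -1) { perror("open"); return 1; }
    if (posix_fallocate(fd1, 0, 1024 * 1024) != 0)
        fprintf(stderr, "posix_fallocate failed\n");
    close(fd1);

    /* Sparse file: seek 1 GiB past the start and write a single byte.  The
     * skipped region becomes a hole that occupies no data blocks yet reads
     * back as zeros. */
    int fd2 = open("sparse.dat", O_CREAT | O_WRONLY, 0644);
    if (fd2 == -1) { perror("open"); return 1; }
    if (lseek(fd2, 1024L * 1024 * 1024, SEEK_SET) == (off_t) -1) { perror("lseek"); return 1; }
    if (write(fd2, "x", 1) != 1) perror("write");
    close(fd2);

    return 0;
}
```

Comparing the logical size reported by ls -l with the allocated blocks reported by du on sparse.dat shows the difference between a file's length and the space it actually consumes.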
Fragmentation and Optimization
Fragmentation in file systems refers to the inefficient allocation and organization of data blocks, leading to wasted space and degraded performance. There are two primary types: internal fragmentation, which occurs when allocated blocks contain unused space (known as slack space), particularly in the last partial block of a file, and external fragmentation, where file blocks are scattered across non-contiguous locations on the storage medium, or free space becomes interspersed with allocated blocks, hindering contiguous allocation. Internal fragmentation arises from fixed block sizes that do not perfectly match file sizes, resulting in wasted space within blocks; for example, using 4 KB blocks for a 1 KB file wastes 3 KB per such allocation. External fragmentation, on the other hand, scatters file extents, making it difficult for the file system to allocate large contiguous regions for new or growing files.

The main causes of fragmentation stem from repeated file creation, deletion, growth, and modification over time, which disrupt the initial organized layout established during storage allocation. As files are incrementally extended or overwritten, blocks may be inserted in available gaps, leading to scattered placement; deletions create small free-space holes that fragment the available area. These processes degrade access performance, particularly on hard disk drives (HDDs), where external fragmentation increases mechanical seek times as the read/write head must jump between distant locations to retrieve a single file. In severe cases, this can significantly slow read operations, potentially doubling the time or more compared to contiguous layouts, as observed in fragmented workloads like database accesses.[61] While initial storage allocation strategies aim to minimize fragmentation through contiguous placement, ongoing file system aging inevitably exacerbates it.

To mitigate fragmentation, defragmentation tools rearrange scattered file blocks into contiguous extents, reducing seek times and improving throughput; these are typically offline processes for HDDs to avoid interrupting system use, involving a full scan and relocation of data. Log-structured file systems (LFS), introduced by Rosenblum and Ousterhout, address fragmentation proactively through append-only writes that treat the disk as a sequential log, minimizing random updates and external fragmentation by grouping related data temporally; this approach can use most of the available disk bandwidth for writes (65-75%) while employing segment cleaning to reclaim space from partially filled log segments. In modern storage, solid-state drives (SSDs) benefit from optimizations like the TRIM command, which informs the drive controller of deleted blocks to enable efficient garbage collection and wear leveling, preventing performance degradation from fragmented invalid data without the need for traditional defragmentation. Additionally, copy-on-write (COW) mechanisms in file systems such as Btrfs and ZFS avoid the in-place updates that exacerbate external fragmentation in traditional systems, instead writing modified data to new locations to preserve snapshots and integrity, though they require careful management to control free-space fragmentation over time.

Access and Security
Data Access Methods
File systems provide mechanisms for applications to read and write data through structured interfaces that abstract the underlying storage. The primary data access methods include byte stream access, which treats files as continuous sequences of bytes suitable for unstructured data like text or binaries, and record access, which organizes data into discrete records for structured retrieval, often used in database or legacy mainframe environments. These methods are implemented via system calls and libraries that handle low-level operations, incorporating buffering and caching to optimize performance by reducing direct disk I/O.[62]

Byte stream access is the dominant model in modern operating systems, where files are viewed as an undifferentiated sequence of bytes that can be read or written sequentially or randomly via offsets. In POSIX-compliant systems, this is facilitated by system calls such as open(), read(), and write(), which operate on file descriptors to transfer specified numbers of bytes between user buffers and the file. For example, read(fd, buf, nbytes) retrieves up to nbytes from the file descriptor fd into buf, advancing the file offset automatically for sequential access or allowing explicit seeking with lseek() for random positioning; this model is ideal for text files, executables, and other binary data where no inherent structure is imposed by the file system.[62][63]
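A minimal illustration of the byte-stream model: a block is read sequentially from the start of a file, and lseek() then repositions the file offset for random access. The file name and offsets are illustrative.

```c
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    char buf[512];

    int fd = open("data.bin", O_RDONLY);        /* illustrative file */
    if (fd == -1) { perror("open"); return 1; }

    /* Sequential access: read advances the file offset automatically. */
    ssize_t n = read(fd, buf, sizeof(buf));
    printf("read %zd bytes from the start of the file\n", n);

    /* Random access: reposition the offset explicitly, then read again. */
    if (lseek(fd, 4096, SEEK_SET) == (off_t) -1) { perror("lseek"); return 1; }
    n = read(fd, buf, sizeof(buf));
    printf("read %zd bytes starting at offset 4096\n", n);

    close(fd);
    return 0;
}
```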
In contrast, record access treats files as collections of fixed- or variable-length records, enabling structured retrieval by key or index rather than byte offset, which is particularly useful for applications requiring efficient random access to specific entries. This method is prominent in mainframe environments like IBM z/OS, where access methods such as Virtual Storage Access Method (VSAM) organize records in clusters or control intervals, supporting key-sequenced, entry-sequenced, or relative-record datasets for indexed lookups without scanning the entire file. For instance, VSAM's key-sequenced organization allows direct access to a record via its unique key, mapping it to physical storage blocks for quick retrieval in database-like scenarios.[64][65] Indexed Sequential Access Method (ISAM), an earlier technique, similarly uses indexes to facilitate record-oriented operations, though it has been largely superseded by more advanced structures in contemporary systems.[66]
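VSAM and ISAM are mainframe facilities, but the essence of record-oriented access can be sketched on a POSIX byte-stream file by imposing fixed-length records: record number n lives at byte offset n times the record size, so a single seek retrieves it without scanning the file. This corresponds roughly to a relative-record dataset; keyed access would additionally require an index structure. The record size and file name below are illustrative assumptions.

```c
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

#define RECORD_SIZE 128     /* fixed record length, chosen for illustration */

/* Read record number `recno` into `rec`; returns 0 on success. */
static int read_record(int fd, long recno, char rec[RECORD_SIZE])
{
    off_t offset = (off_t) recno * RECORD_SIZE;   /* record number -> byte offset */
    if (lseek(fd, offset, SEEK_SET) == (off_t) -1)
        return -1;
    return read(fd, rec, RECORD_SIZE) == RECORD_SIZE ? 0 : -1;
}

int main(void)
{
    char rec[RECORD_SIZE];
    int fd = open("records.dat", O_RDONLY);       /* illustrative data file */
    if (fd == -1) { perror("open"); return 1; }

    if (read_record(fd, 42, rec) == 0)            /* fetch record 42 directly */
        printf("record 42 starts with byte 0x%02x\n", (unsigned char) rec[0]);

    close(fd);
    return 0;
}
```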
Application programming interfaces (APIs) bridge these access methods with user code, often layering higher-level abstractions over system calls for convenience and efficiency. In C, the standard library functions like fopen(), fread(), and fwrite() create buffered streams (FILE* objects) that wrap POSIX file descriptors, performing user-space buffering (typically in blocks of 4 KB or larger) to amortize system call overhead. For example, fopen(filename, "r") opens a file in read mode, returning a stream from which fread() reads blocks of data, with the library handling partial reads and buffer flushes transparently. This buffering contrasts with unbuffered system calls like read(), which transfer data directly without intermediate caching in user space.
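The buffering performed by the stdio layer can be made explicit with setvbuf(), which selects the buffer size used to batch many small library calls into fewer underlying write() system calls; the 64 KiB figure and file name below are arbitrary illustrative choices.

```c
#include <stdio.h>

int main(void)
{
    FILE *fp = fopen("log.txt", "w");         /* illustrative file name */
    if (fp == NULL) { perror("fopen"); return 1; }

    /* Fully buffered stream with a 64 KiB buffer allocated by the library:
     * the fprintf() calls below are collected in user space and written to
     * the kernel in large chunks. */
    setvbuf(fp, NULL, _IOFBF, 64 * 1024);

    for (int i = 0; i < 1000; i++)
        fprintf(fp, "record %d\n", i);        /* no system call per line */

    fclose(fp);                               /* flushes the remaining buffer */
    return 0;
}
```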
To further enhance access efficiency, file systems employ caching mechanisms, primarily through a page cache maintained in RAM to store recently accessed file pages, avoiding repeated disk reads for frequent operations. In Linux, the page cache holds clean and dirty pages (modified data awaiting write-back), with the kernel's writeback threads enforcing flush policies based on tunable parameters like dirty_ratio (percentage of RAM that can hold dirty pages before forcing writes) and periodic flushes every 5-30 seconds to balance memory usage and durability. When a file is read, the kernel checks the page cache first; if a miss occurs, it allocates pages from available memory and faults them in from disk, while writes may defer to the cache until a flush threshold is met, improving throughput for workloads with locality.[67]
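From an application's perspective, the main control over the page cache is an explicit flush: write() normally returns once the data has been copied into the cache, while fsync() (or fdatasync()) blocks until the dirty pages and associated metadata for that file reach stable storage, rather than waiting for the kernel's periodic writeback. A minimal sketch with an illustrative file name:

```c
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd = open("journal.log", O_CREAT | O_WRONLY | O_APPEND, 0644);
    if (fd == -1) { perror("open"); return 1; }

    const char *msg = "committed\n";
    /* Completes as soon as the data is copied into the page cache. */
    if (write(fd, msg, strlen(msg)) == -1) { perror("write"); return 1; }

    /* Blocks until the dirty pages (and file metadata) are on stable storage,
     * instead of waiting for the kernel's periodic writeback. */
    if (fsync(fd) == -1) { perror("fsync"); return 1; }

    close(fd);
    return 0;
}
```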