OverlayFS
OverlayFS is a union filesystem implementation in the Linux kernel that merges multiple underlying directories or filesystems into a single, unified view presented to users and applications. It overlays a writable upper directory on top of one or more read-only lower directories: modifications, creations, and deletions affect only the upper layer, while reads are served from the upper layer first and fall back to the lower layers as needed. A dedicated work directory, located on the same filesystem as the upper directory, holds temporary files during operations such as copy-up.[1]
Originally developed as a prototype to provide efficient union filesystem functionality, OverlayFS was merged into the mainline Linux kernel with version 3.18, released on December 7, 2014. This integration enabled lightweight stacking of filesystems without the overhead of full copies, supporting copy-on-write semantics where unchanged files are referenced from lower layers. Over subsequent kernel releases, enhancements have included multiple lower layers (supported since kernel 4.0), data-only lower layers (introduced in kernel 6.8) for separating file data from metadata, and unique inode numbers via the "xino" option to improve compatibility with tools expecting stable identifiers.[2][1]
Key mechanisms in OverlayFS include whiteouts, special files that hide corresponding entries in lower layers, and opaque directories, which hide the contents of the same-named directories in lower layers. These, along with extended attributes for overlay-specific metadata (prefixed with "trusted.overlay."), ensure consistent behavior across layers, including support for nesting and sharing. The filesystem also accommodates advanced scenarios such as NFS exports and fs-verity integration for integrity-protected files, making it suitable for production environments.[1]
OverlayFS is notably employed in container platforms such as Docker, where the overlay2 storage driver uses it to stack image layers efficiently, supporting up to 128 layers. In live operating systems and bootable media, it allows non-destructive modifications over read-only base filesystems, preserving the original contents. In embedded Linux systems, OverlayFS protects data by directing writes to volatile storage such as RAM, preventing corruption of flash media during power failures or updates.[3][4][5]
Overview
Definition and Purpose
OverlayFS is a union mount filesystem implementation provided as a module in the Linux kernel. It enables the merging of multiple directories or filesystems, referred to as layers, into a single, unified view presented to users and applications. In this setup, an upper layer, typically writable, is overlaid onto one or more lower layers, which are often read-only; objects in the upper layer take precedence, while those absent from the upper layer are transparently accessed from the lower layers. A dedicated work directory, on the same filesystem as the upper layer, is used for temporary files during operations such as copy-up.[1]
Union filesystems, the broader concept underlying OverlayFS, allow separate directory trees or filesystem branches to be combined transparently, creating a composite namespace where reads prioritize the topmost branch and writes are directed to a designated writable branch without modifying the originals. This approach provides copy-on-write semantics, ensuring that underlying data remains intact while changes are isolated. OverlayFS implements this paradigm natively within the kernel, offering a lightweight and performant alternative to earlier user-space or out-of-tree solutions.[6]
The primary purpose of OverlayFS is to support read-write operations on top of immutable lower layers, preserving the lower filesystem's state while directing all modifications to the upper layer. This design suits scenarios that need layered filesystem management without altering base storage, such as live operating systems where temporary user changes overlay a read-only root filesystem. It originated to fulfill the need for robust, in-kernel overlay functionality in Linux, addressing bootable media and emerging container workflows with a standardized, kernel-integrated union filesystem.[1][7]
Key Features
OverlayFS supports stacking multiple read-only lower directories beneath a single writable upper layer, enabling the merging of hierarchical filesystem views. For instance, with the mount option lowerdir=/lower1:/lower2, the leftmost directory is the topmost lower layer, so files in /lower1 override same-named files in /lower2 in the merged namespace, while the upper layer captures all modifications. This multi-layer capability allows filesystem snapshots or distributions to be composed without duplicating data.[1]
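The precedence rule for stacked layers can be illustrated with a small model. This is a sketch rather than kernel code: it follows the kernel documentation's ordering, in which the leftmost lowerdir entry is the topmost lower layer, and the helper names layer_order and resolve are hypothetical. It assumes plain colon-separated paths with no escaped colons and no data-only ("::") layers.

```python
def layer_order(lowerdir, upperdir=None):
    """Return layer paths from highest to lowest precedence.

    Assumes plain ':'-separated paths (no escaped colons, no '::'
    data-only layers). The upper layer, if any, always comes first.
    """
    lowers = [p for p in lowerdir.split(":") if p]
    return ([upperdir] if upperdir else []) + lowers


def resolve(name, layers):
    """Return the path of the first layer that contains `name`.

    `layers` is a list of (layer_path, set_of_names) pairs ordered from
    top to bottom; None is returned when no layer has the name.
    """
    for path, names in layers:
        if name in names:
            return path
    return None
```

With lowerdir=/lower1:/lower2 and an upper directory /upper, layer_order yields /upper, /lower1, /lower2, and a name present in both lower layers resolves to /lower1.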
The filesystem maintains POSIX compliance in its merged view, preserving standard file permissions, ownership, and semantics for operations like reading, writing, and renaming across layers. While it adheres to POSIX for most behaviors, such as consistent directory traversals and file locking, certain optimizations like read-only access to lower layers without updating access times represent deliberate trade-offs for efficiency. This compliance ensures seamless integration with existing Linux applications and tools.[1]
Efficiency is achieved through lazy copy-up: files in lower layers are not duplicated to the upper layer until an operation needs to modify them. This on-demand copying minimizes storage overhead and I/O during mounts and reads, with the upper layer materializing only what has changed. Additionally, metacopy support allows a copy-up triggered by a metadata change to copy only the file's metadata, deferring the data copy until the file is opened for writing, which avoids copying large files whose contents are never modified.[1]
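A user-space model can make the lazy copy-up idea concrete. The sketch below is not how the kernel implements copy-up; it only mimics the observable behavior using two ordinary directory trees, and the function names open_for_write and read_merged are hypothetical.

```python
import os
import shutil


def open_for_write(lower_root, upper_root, relpath):
    """Model of copy-up: copy the file into the upper tree on first write."""
    upper_path = os.path.join(upper_root, relpath)
    if not os.path.exists(upper_path):
        lower_path = os.path.join(lower_root, relpath)
        os.makedirs(os.path.dirname(upper_path), exist_ok=True)
        shutil.copy2(lower_path, upper_path)  # copies data and basic metadata
    # All writes go to the upper copy; the lower tree is never opened for writing.
    return open(upper_path, "r+")


def read_merged(lower_root, upper_root, relpath):
    """Read from the upper copy if it exists, otherwise from the lower tree."""
    for root in (upper_root, lower_root):
        path = os.path.join(root, relpath)
        if os.path.exists(path):
            with open(path) as f:
                return f.read()
    raise FileNotFoundError(relpath)
```

Reading a file through read_merged leaves the upper tree empty; only the first call to open_for_write creates the upper copy, after which reads see the modified content while the lower file stays as it was.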
OverlayFS provides NFS export capability via the nfs_export=on mount option, allowing the merged filesystem to be shared over NFS while maintaining consistent views for clients. This requires underlying filesystems to support NFS exporting and uses inode indexing to handle cross-layer references reliably.[1]
Metadata handling in OverlayFS includes options for inode numbers and integrity verification. The xino=on or xino=auto option assigns unique, persistent inode numbers to overlaid files by encoding a filesystem identifier into the high bits of the inode number, preventing collisions in the merged view. Support for fs-verity extends integrity checking to files whose data still resides on a lower layer: when the verity=on or verity=require option is enabled, the fs-verity digest of the lower file is recorded in an extended attribute of the upper-layer metadata copy and checked against the lower file when it is used.[1]
History
Development Origins
The development of OverlayFS originated from early 2009 discussions within the Linux kernel community about the need for a robust union filesystem, driven by the limitations of existing out-of-tree implementations like AUFS, whose complexity and unclear design prevented their mainline inclusion.[6] These conversations, documented on LWN.net, emphasized the demand for a solution that could merge multiple filesystem namespaces into a unified view while minimizing modifications to the Virtual File System (VFS) layer, addressing use cases such as writable overlays on read-only media for live distributions and embedded environments.[6][8]
Miklos Szeredi, a prominent kernel developer, initiated the OverlayFS project in response to these needs, submitting the initial Request for Comments (RFC) patchset in 2010 to propose a hybrid approach combining VFS-level directory handling with direct access to underlying filesystems for efficiency.[9] This design prioritized simplicity and correctness, distinguishing it from more intricate predecessors like AUFS, with features such as "copy-up" operations for writable modifications and extended attributes for managing whiteouts and opaque directories.[9][7]
Experimental adoption began in 2011 when OpenWrt integrated OverlayFS into its embedded routing firmware to enable flexible overlays on resource-constrained devices.[7] Prior to mainline acceptance, the project encountered stability concerns around preventing unintended modifications to lower-layer filesystems and resolving locking issues, which required multiple patch revisions and community scrutiny.[7]
Kernel Integration and Releases
OverlayFS was integrated into the mainline Linux kernel in version 3.18, released in December 2014, through a pull request submitted by Miklos Szeredi.[10] This marked the filesystem's transition from out-of-tree development to official kernel support, enabling its use in production environments without custom patches. In Linux kernel 4.0, released in April 2015, OverlayFS received enhancements that facilitated its adoption as the backing storage driver for Docker's overlay2 implementation, which became the default for systems supporting OverlayFS.[3] Subsequent releases in the 4.x series, such as 4.2 and 4.9, introduced stability improvements including better handling of metadata inconsistencies and support for additional filesystems as lower layers, addressing early reliability issues reported in container workloads.[1]
More recent developments have focused on advanced layering capabilities and security. Linux kernel 6.8, released in March 2024, added support for "data-only" lower layers, allowing layers to contribute solely file contents without exposing their directory structure in the merged view, which enhances privacy and efficiency in multi-layer setups like container images.[1] Kernel 6.15, released in May 2025, introduced the override_creds mount option, which records the calling task's credentials for accessing lower layers and mitigates privilege escalation concerns in untrusted overlays. The same release added support for specifying layers using O_PATH file descriptors rather than paths, which avoids path traversal risks and enables more flexible mounting in sandboxed environments.[1][11][12]
Adoption milestones include Slackware's integration of OverlayFS into its Live Edition in 2016, where it replaced older union filesystems for providing writable persistence on read-only media like CDs and USB sticks.[13] Ongoing refinements, particularly in kernels 5.x and 6.x, have optimized OverlayFS for container ecosystems, with Docker and Podman leveraging it for layered image management and runtime efficiency.[3]
Architecture
Layering and Merging
OverlayFS uses a layered architecture that combines a single writable upper filesystem with one or more read-only lower filesystems to form a unified view. The upper layer is the writable component where modifications are stored, while the lower layers provide read-only content that the upper layer can override. The merged filesystem gives priority to objects in the upper layer, so any same-named items in the lower layers are hidden.[1]
The core merging process recursively unions directories from the upper and lower layers into a single coherent namespace. Directories are combined by aggregating their contents, with names from the upper directory taking precedence over those in the lower directories. For files and other non-directory objects, an item in the upper layer completely overrides its equivalent in the lower layer, while the absence of an item in the upper layer exposes the corresponding item from the lower layer. This union mechanism presents a seamless view to userspace applications, hiding the individual layer boundaries.[1]
Support for multiple lower layers enables more flexible configurations, such as chaining where the upper layer overlays a stack of lower layers (e.g., upper on lower1 on lower2). Layers are searched in order from the topmost lower layer down to the bottommost, so higher layers in the stack override content from those below. Lower layers can themselves be instances of OverlayFS, allowing nested overlays for complex scenarios.[1]
To keep the view consistent, OverlayFS maintains merged directory caches that integrate name lists from all contributing layers, presenting a unified namespace to userspace processes. This caching hides the distinctions between layers, allowing applications to treat the filesystem as a single entity without awareness of the underlying composition. Upper-layer metadata and attributes are consistently applied in the merged view.[1]
Special Files and Metadata Handling
OverlayFS employs special files known as whiteouts to handle deletions in its layered structure, allowing files from lower layers to be hidden without physically removing them from the underlying filesystems. A whiteout is created in the upper layer as a character device with a device number of 0/0, or alternatively as a zero-sized file bearing the extended attribute trusted.overlay.whiteout. When a file is deleted in the merged view, OverlayFS generates this whiteout, which masks the corresponding lower-layer file during directory enumeration, effectively simulating its removal while preserving the original data intact.[1]
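The character-device form of a whiteout can be recognized from ordinary stat information. The following is a minimal sketch for inspecting an upper directory directly (not the merged view, where whiteouts are not shown); the function names are hypothetical, and the extended-attribute form of a whiteout is not covered.

```python
import os
import stat


def is_chardev_whiteout(mode, rdev):
    """True if (mode, rdev) describe a character device numbered 0/0."""
    return stat.S_ISCHR(mode) and os.major(rdev) == 0 and os.minor(rdev) == 0


def path_is_whiteout(path):
    """Check an entry in an upper directory without following symlinks."""
    st = os.lstat(path)
    return is_chardev_whiteout(st.st_mode, st.st_rdev)
```

Calling path_is_whiteout on each entry of an upper directory would list the names that have been deleted from the merged view.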
When a directory from a lower layer is deleted and a new directory with the same name is created, OverlayFS uses an opaque directory so that the old lower-layer contents do not reappear. An opaque directory is marked in the upper layer with the extended attribute trusted.overlay.opaque="y"; the merged view then shows only the entries of the upper directory, and nothing from the same-named directories in lower layers is visible or accessible. Non-directory files are always treated as opaque. An opaque directory should never contain whiteouts, because there is nothing beneath it for them to hide. Separately, a merged directory that contains whiteouts in the extended-attribute form (zero-sized files carrying trusted.overlay.whiteout) is marked with trusted.overlay.opaque="x", which lets readdir avoid checking every entry for that attribute in the common case.[1]
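The combined effect of whiteouts and opaque directories on a directory listing can be sketched with plain sets. This is a simplified model under stated assumptions: only the upper layer carries whiteouts and the opaque flag, an opaque directory still shows its own upper-layer entries (as the kernel documentation describes), and the function name merged_listing and the dictionary layout are hypothetical.

```python
def merged_listing(upper, lowers):
    """Model the merged readdir result for a single directory.

    upper:  {"entries": set of names, "whiteouts": set of names, "opaque": bool}
    lowers: list of name sets, ordered from the top lower layer to the bottom.
    """
    visible = set(upper["entries"])
    if upper["opaque"]:
        # Opaque: nothing from the lower layers shows through.
        return sorted(visible)
    for names in lowers:
        # A whiteout in the upper layer hides the same name in every lower layer.
        visible |= names - upper["whiteouts"]
    return sorted(visible)
```

With upper entries {"new"}, a whiteout for "gone", and lower layers {"gone", "a"} and {"b"}, the model lists a, b, and new; marking the upper directory opaque reduces the listing to new alone.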
Metadata handling in OverlayFS prioritizes the upper layer for consistency in the merged presentation. In the merged view, inodes for directories derive their metadata solely from the upper layer if present, while non-directories may inherit from either the upper or lower layer depending on availability; the st_dev field for directories reports the overlay's device, but non-directories might reflect the underlying filesystem's device. To ensure unique and persistent inode numbers (st_ino), the mount option xino=on (or xino=auto) enables inode composition, combining the real inode number with a filesystem identifier (fsid) for uniqueness across layers, provided the underlying filesystems support NFS file handles for persistence. Without this option, inode numbers may vary and are not guaranteed to be persistent. Additionally, OverlayFS does not update access times (atime) on lower-layer files during reads, deviating from strict POSIX semantics to avoid unnecessary copy-up operations and maintain read-only integrity for lower layers.[1]
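The idea of composing an inode number from a real inode number and a filesystem identifier can be shown with simple bit arithmetic. The sketch below illustrates the principle only; it does not reproduce the kernel's exact bit layout, and the function name compose_xino and the parameters xino_bits and ino_bits are assumptions made for illustration.

```python
def compose_xino(real_ino, fsid, xino_bits, ino_bits=64):
    """Pack `fsid` into the top `xino_bits` of an `ino_bits`-wide number.

    Returns None when `real_ino` already uses the reserved high bits, in
    which case a unique composed number cannot be formed this way.
    """
    low_bits = ino_bits - xino_bits
    if real_ino >> low_bits:
        return None
    if fsid >= (1 << xino_bits):
        raise ValueError("fsid does not fit in xino_bits")
    return (fsid << low_bits) | real_ino
```

Two files with the same real inode number on different underlying filesystems receive different composed numbers because their fsid values differ, which is the collision the option is meant to prevent.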
Changes to metadata, such as permissions via chmod, trigger a copy-up operation in OverlayFS, where the affected file or directory is fully copied from the lower layer to the upper layer to allow the modification. This ensures that metadata alterations are isolated to the writable upper layer without impacting read-only lower components. When the metacopy=on mount option is enabled, a metadata change copies only the metadata to the upper layer (marked with the trusted.overlay.metacopy extended attribute) and defers the data copy until the file is opened for writing, which helps when only attributes are modified.[1]
Implementation
Mount Options and Configuration
OverlayFS is mounted using the standard Linux mount command with the filesystem type overlay. The basic syntax is mount -t overlay overlay -o lowerdir=/lower,upperdir=/upper,workdir=/work /merged, where /merged represents the mount point that presents the unified view of the layered filesystems.[1]
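When mounts are scripted, the command line above can be assembled programmatically. The helper below is a sketch that only builds the argument list and does not run anything; its name overlay_mount_argv is hypothetical, and it assumes paths contain no commas or colons, which would need escaping.

```python
def overlay_mount_argv(lowerdirs, upperdir, workdir, target, extra=()):
    """Build the argv for `mount -t overlay` without executing it.

    lowerdirs is ordered from the top lower layer to the bottom one;
    extra holds additional option strings such as "metacopy=on".
    """
    if not lowerdirs:
        raise ValueError("at least one lower directory is required")
    opts = [
        "lowerdir=" + ":".join(lowerdirs),
        "upperdir=" + upperdir,
        "workdir=" + workdir,
    ]
    opts += list(extra)
    return ["mount", "-t", "overlay", "overlay", "-o", ",".join(opts), target]
```

The resulting list can be passed to subprocess.run(argv, check=True) by a process with sufficient privileges to mount filesystems.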
The lowerdir option specifies one or more read-only lower layers, which provide the base filesystem content; multiple directories are separated by colons (e.g., lowerdir=/lower1:/lower2) and are stacked starting from the rightmost entry, so the leftmost directory is the topmost lower layer. The upperdir option points to a writable upper layer where modifications, such as file creations or changes, are stored to preserve the read-only nature of lower layers. The workdir option designates a directory used for internal OverlayFS operations, such as preparing copy-ups; it must be an empty directory on the same filesystem as upperdir, so it cannot be placed on a separate filesystem such as tmpfs unless upperdir resides there as well.[1]
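The two workdir requirements (empty, and on the same filesystem as upperdir) can be checked before attempting a mount. The sketch below is an approximation: comparing st_dev values is a common heuristic for "same filesystem" but can be misleading in some setups, for example with subvolumes, and the function name check_workdir is hypothetical.

```python
import os


def check_workdir(upperdir, workdir):
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    # Heuristic: directories on the same filesystem normally share st_dev.
    if os.stat(upperdir).st_dev != os.stat(workdir).st_dev:
        problems.append("workdir is not on the same filesystem as upperdir")
    if os.listdir(workdir):
        problems.append("workdir is not empty")
    return problems
```

A caller might refuse to mount, or report the messages to the user, whenever the returned list is non-empty.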
Several key options allow customization of OverlayFS behavior. The redirect_dir=on option allows directories that exist on a lower layer or are merged to be renamed, recording the original location in an extended attribute; it is off unless enabled through the kernel configuration, a module parameter, or the mount option. The metacopy=on option optimizes copy-up by initially copying only file metadata (e.g., permissions and timestamps) and deferring the data copy until the file is opened for writing; it is likewise off unless explicitly enabled. For environments requiring NFS exports, nfs_export=on allows the overlay to be exported over NFS with file handles that remain valid across layers. The volatile option skips synchronization calls to the underlying filesystems, improving performance but risking data loss on crashes, making it unsuitable for persistent storage. Additionally, userxattr switches extended attribute storage from the trusted.overlay. namespace to user.overlay. for use in user namespaces where trusted attributes may not be accessible.[1]
Configuration prerequisites include ensuring the upper layer's filesystem supports trusted or user extended attributes and provides valid d_type in directory entries, as filesystems like NFS do not meet these requirements. Lower layers can be any mountable filesystem, including another OverlayFS instance, without needing write support. Recent kernel versions (e.g., 6.8+) introduce advanced features like specifying layers via file descriptors with fsconfig or data-only lower layers using double colons (e.g., lowerdir=/l1::/do1), but these build on the core options for layering read-only bases with a writable overlay. Additionally, kernel 6.18 introduced support for case-folding, enabling case-insensitive handling of files and directories to improve compatibility with certain filesystems in container environments.[1][14]
Core Operations and Behaviors
OverlayFS manages file operations by transparently merging the upper and lower layers, presenting a unified view to applications while leaving the lower layers unchanged.[1]
For read operations, OverlayFS presents objects from the upper layer when they exist there and falls back to the lower layer otherwise.[1] This applies to non-directory objects such as regular files, symbolic links, and device special files.[1] A file on a lower layer can be opened read-only and memory-mapped with MAP_SHARED, but later changes to the file, which land in the upper-layer copy after copy-up, are not reflected in that mapping.[1]
Write operations trigger a copy-up on the first modification of a lower-layer file that requires write access, such as opening it for read-write or truncating it.[1] During copy-up, the file is copied from the lower to the upper layer, after which all subsequent writes occur on the upper copy.[1] Direct writes to the lower layer are not permitted, preserving its read-only integrity even if the lower filesystem itself is writable.[1]
Renames and deletions are handled so that the lower layer is never modified. When a directory that is on the lower layer or merged (that is, not originally created on the upper layer) is renamed, the outcome depends on the redirect_dir feature. If it is enabled, the directory itself is copied up (but not its contents) and an extended attribute recording its original location is set before the rename completes. If it is not enabled, rename(2) returns EXDEV and the caller is expected to fall back to copying and deleting, as it would for a move across filesystems.[1] Deletions of lower-layer files or directories use whiteouts, special markers created on the upper layer as character devices numbered 0/0 or as zero-sized regular files with a dedicated extended attribute, without altering the lower layer.[1]
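Applications that rename directories on an overlay may want to handle the EXDEV error, which the kernel documentation describes for lower or merged directories when redirect_dir is not enabled. The sketch below shows one common pattern, falling back to a copy-and-delete move; the function name rename_with_fallback is hypothetical, and it assumes the destination does not already exist (shutil.move treats an existing destination directory differently from rename).

```python
import errno
import os
import shutil


def rename_with_fallback(src, dst):
    """Try rename(2); fall back to copy-and-delete if it fails with EXDEV."""
    try:
        os.rename(src, dst)
        return "renamed"
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            raise  # any other failure is a genuine error
    # Same strategy mv(1) uses for moves across filesystem boundaries.
    shutil.move(src, dst)
    return "moved"
```

The return value lets the caller log which path was taken, which can help explain why some directory renames are much slower than others.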
Additional behaviors include relaxed handling of lower-layer files that are being executed: opening such a file for writing or truncating it is not denied with an ETXTBSY error.[1] In volatile mode, enabled via the volatile mount option, synchronization calls to the upper filesystem are omitted. Data written to the overlay is therefore not guaranteed to survive a crash, so the mode is suited to contents that can be recreated easily rather than to persistent storage.[1]