mdadm
mdadm is a command-line utility for creating, managing, and monitoring software RAID devices in Linux, utilizing the Multiple Devices (MD) driver within the Linux kernel to aggregate multiple physical block devices into a single virtual device with optional redundancy for fault tolerance.[1][2] It supports a range of RAID levels, including LINEAR, RAID0 (striping for performance), RAID1 (mirroring for redundancy), RAID4, RAID5, RAID6 (with parity for data protection), RAID10 (combining mirroring and striping), as well as specialized modes like MULTIPATH (deprecated for new installations), FAULTY, and CONTAINER.[1][3][4] Originally developed by Neil Brown in 2001 as the primary tool for Linux software RAID, mdadm replaced older utilities like raidtools and has been maintained as an open-source project under the GNU General Public License version 2, with its source code originally hosted on kernel.org and now primarily on GitHub.[2][5]
It enables key operations such as assembling arrays from existing components, building new arrays with metadata superblocks, growing or reshaping arrays, and managing device addition, removal, or failure states.[1][6] mdadm also includes monitoring capabilities via a daemon mode to detect and report array degradation or failures, supporting hot-plug environments through incremental assembly.[1][3] The utility handles various metadata formats for RAID configuration, with version 1.2 as the default, alongside support for older formats like 0.90 and 1.0, as well as external formats such as Intel Matrix Storage Manager (IMSM) and the Common RAID Disk Data Format (DDF) in maintenance mode.[2][1]
At boot time, mdadm facilitates automatic array assembly by scanning for partitions with type 0xfd or using kernel parameters for non-persistent setups, ensuring seamless integration with the kernel's MD driver.[3] It requires a minimum kernel version of 3.10 and is included in major Linux distributions, often as part of packages like mdadm or mdraid.[2][6]
Introduction
Overview
mdadm is a command-line utility for administering and monitoring software RAID arrays in Linux, serving as the primary tool for the md (multiple devices) driver stack in the kernel. It enables the creation, assembly, management, and maintenance of virtual block devices composed from multiple physical or virtual storage components, supporting redundant array configurations for data protection and performance enhancement.[1][2] Introduced in the early 2000s, mdadm replaced legacy tools such as raidtools, which were used for earlier Linux software RAID implementations but lacked support for modern features like flexible metadata formats. By the mid-2000s, it had become the standard in major distributions, providing a unified interface for array operations that addressed limitations in the older utilities.[7][8] mdadm constructs arrays from whole disks, disk partitions, or loopback devices, allowing flexible integration with existing storage setups. It is licensed under the GNU General Public License version 2 or later. The current stable version is 4.4, released on August 19, 2025.[9][10][7]
History
mdadm was first released in 2001 by Neil Brown, a developer at SUSE Labs, as a modern replacement for the older raidtools utility, providing enhanced management capabilities for Linux software RAID arrays.[11] This initial version addressed limitations in prior tools by offering a unified command-line interface for creating, assembling, and monitoring MD devices, quickly gaining adoption within the Linux community. Following its inception, mdadm was integrated into major Linux distributions such as Debian, Red Hat, and Ubuntu, where it replaced legacy RAID management software, and it later transitioned to community-driven maintenance after Brown's tenure at SUSE.[12]
Key milestones in mdadm's development include the addition of support for partitionable arrays in the Linux kernel 2.6 series (2003–2004), enabling RAID devices to be partitioned like regular block devices for greater flexibility in storage configurations. In 2008, with kernel 2.6.27, external metadata formats were introduced, allowing mdadm to interoperate with hardware RAID controllers by storing RAID information outside the array data area.[13] TRIM support for SSDs arrived in kernel 3.7 (2012), permitting discard commands to propagate through RAID layers to optimize solid-state drive performance and longevity.[14]
Neil Brown, who had maintained mdadm for over two decades, stepped back from active involvement, leading to a transition in leadership. On December 14, 2023, Mariusz Tkaczyk was announced as the new lead contributor and maintainer, ensuring continued development and bug fixes.[15] Deprecations marked shifts in focus: linear mode—used for simple concatenation of devices—was deprecated in the Linux kernel due to low usage and redundancy with alternatives like dm-linear, and the md-linear module was fully removed in Linux kernel 6.8 (March 2024).[16]
To facilitate ongoing collaborative development, the project shifted its primary repository to the md-raid-utilities organization on GitHub in late 2023, incorporating continuous integration and broader community contributions. Version 4.4 (August 2025) introduced features like custom device policies and improved self-encrypting drive (SED) support for IMSM metadata.[2][9]
Configurations
RAID Levels
mdadm supports several standard RAID levels through the Linux md driver, enabling software-based redundancy and performance optimization across multiple block devices. These levels include RAID 0 for striping, RAID 1 for mirroring, RAID 4 with dedicated parity, RAID 5 and RAID 6 for distributed parity, and RAID 10 combining striping and mirroring.[1][3] Each level balances capacity, performance, and fault tolerance differently, with mdadm handling array creation, management, and recovery. The following table summarizes the key characteristics of these RAID levels as implemented in mdadm:

| RAID Level | Description | Minimum Devices | Capacity | Fault Tolerance |
|---|---|---|---|---|
| RAID 0 | Striping without redundancy for maximum performance | 2 | Sum of all device capacities | None |
| RAID 1 | Mirroring for data duplication across devices | 2 | Capacity of the smallest device | N-1 failures (at least one copy must survive) |
| RAID 4 | Striping across data devices with a dedicated parity disk | 3 | (N-1) × capacity of smallest data device, where N is the number of devices | 1 failure |
| RAID 5 | Striping with distributed parity across all devices | 3 | (N-1) × capacity of smallest device, where N is the number of devices | 1 failure |
| RAID 6 | Striping with double distributed parity for enhanced protection | 4 | (N-2) × capacity of smallest device, where N is the number of devices | 2 failures |
| RAID 10 | Striping of mirrored pairs for combined performance and redundancy | 4 | (N/2) × capacity of smallest device, where N is the number of devices | Up to 1 failure per mirror pair |
Arrays are created with the --create option and parameters specifying the level, number of devices, and component devices; for example, a RAID 5 array on three devices can be created as mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sd[abc], optionally including --chunk=512 for the stripe (chunk) size.[1] Similar syntax applies to other levels by adjusting --level and --raid-devices, such as --level=10 --raid-devices=4 for RAID 10.[1]
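A brief shell sketch of typical creation commands, using hypothetical device names (/dev/sd[abc]1, /dev/sd[defg]1) and the default 1.2 metadata:

```bash
# RAID 5 across three partitions with an explicit 512 KiB chunk size
mdadm --create /dev/md0 --level=5 --raid-devices=3 --chunk=512 /dev/sda1 /dev/sdb1 /dev/sdc1

# RAID 10 across four partitions
mdadm --create /dev/md1 --level=10 --raid-devices=4 /dev/sd[defg]1

# verify the newly created arrays
cat /proc/mdstat
mdadm --detail /dev/md0
```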
mdadm RAID arrays integrate seamlessly with modern filesystems like XFS or Btrfs, allowing RAID-aware setups where the filesystem is created directly on the assembled /dev/mdX device for optimized storage management.[17]
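For example, a filesystem can be placed directly on the assembled array (device and mount point are illustrative):

```bash
mkfs.xfs /dev/md0              # create the filesystem on the md device
mkdir -p /mnt/data
mount /dev/md0 /mnt/data       # mount it like any other block device
```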
Non-RAID Modes
mdadm supports non-RAID modes that provide basic device aggregation without data redundancy or striping, contrasting with traditional RAID levels focused on performance or fault tolerance.[4] These modes include linear concatenation, multipath aggregation, and the faulty mode for testing, which are useful for simple storage extension, path redundancy, or simulating failures in specific scenarios.[3]
Linear Mode
Linear mode in mdadm concatenates multiple block devices into a single logical volume, presenting them as one contiguous address space where data is written sequentially from the first device to the last.[4] The total capacity equals the sum of the individual device sizes, with no overhead for parity or mirroring, making it suitable for extending storage volume without redundancy, such as combining drives of varying sizes to create a larger filesystem.[3] For example, it can span data across disks in a JBOD-like setup for archival purposes where data loss on failure is acceptable.[4] To create a linear array, the command mdadm --create /dev/md0 --level=linear --raid-devices=2 /dev/sda /dev/sdb assembles the devices, writing metadata superblocks to enable persistence across reboots.[4] This mode requires at least one device but supports multiple, and it operates without striping, so read/write performance remains limited to the speed of the active device.[3]
Linear mode has been deprecated due to lack of active development and maintenance, with the kernel module md-linear removed in Linux kernel version 6.8 released in March 2024.[18] As a result, new linear arrays cannot be created or assembled on kernels 6.8 and later, rendering the mode fully phased out for modern systems; alternatives like device-mapper linear targets are recommended for concatenation needs.[19]
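As an illustration of the recommended replacement, two devices can be concatenated with the device-mapper linear target roughly as follows (device names are hypothetical; lengths are in 512-byte sectors):

```bash
s1=$(blockdev --getsz /dev/sdb1)   # length of the first device in sectors
s2=$(blockdev --getsz /dev/sdc1)   # length of the second device in sectors

# table format: <start> <length> linear <device> <offset>
dmsetup create concat0 <<EOF
0 $s1 linear /dev/sdb1 0
$s1 $s2 linear /dev/sdc1 0
EOF
```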
Multipath Mode
Multipath mode aggregates multiple identical paths to the same underlying physical storage device, providing fault tolerance by routing I/O through available paths and failing over on individual path errors.[4] It requires a minimum of two devices representing redundant paths, such as in SCSI or Fibre Channel setups, and ensures continuous access by marking failed paths as spare while using active ones.[3] This mode is particularly useful for high-availability environments where path redundancy prevents downtime from cable or controller failures, without providing data-level protection.[4] Creation follows a similar syntax to other modes, for instance, mdadm --create /dev/md0 --level=multipath --raid-devices=2 /dev/sda /dev/sdb, which initializes the array with metadata identifying the paths.[4] The kernel handles path selection and failover transparently, but the mode does not support striping or redundancy beyond path aggregation.[3]
Multipath mode was deprecated in mdadm, with the md-multipath kernel module removed in Linux kernel 6.8 (March 2024).[18] New installations should use the more robust device-mapper multipath (DM-Multipath) subsystem, which offers advanced policy-based routing and broader hardware compatibility.[4]
Faulty Mode
Faulty mode in mdadm is a special personality designed for testing and development: it layers over a single underlying device and injects simulated failures, allowing RAID failure handling to be exercised without real hardware faults.[4] It operates on one device and does not provide any data aggregation or redundancy; instead, it allows users to test failure handling, such as how arrays respond to degraded states or spare activation.[3] For example, it can be used to verify resync processes or alert mechanisms without risking real hardware.[4] To create a faulty array, the command mdadm --create /dev/md0 --level=faulty --raid-devices=1 /dev/sda initializes the device with the faulty personality, writing appropriate metadata.[4] This mode supports configurable failure behaviors, such as transient or persistent read/write errors, but offers no practical storage utility beyond testing.[3]
Faulty mode has been deprecated due to limited use and lack of maintenance, with the kernel module md-faulty removed in Linux kernel version 6.8 (March 2024).[18] As a result, faulty arrays cannot be created or assembled on kernels 6.8 and later; for testing failure scenarios, alternatives like manual device marking via mdadm or device-mapper fault injection tools are recommended.[19]
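One simple alternative is to exercise failure handling on a redundant test array by marking a member faulty by hand; a sketch with illustrative device names:

```bash
mdadm /dev/md0 --fail /dev/sdb1      # simulate a device failure
mdadm --detail /dev/md0              # the array now reports a degraded state
mdadm /dev/md0 --remove /dev/sdb1    # detach the "failed" device
mdadm /dev/md0 --re-add /dev/sdb1    # bring it back; resync or bitmap recovery follows
```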
Features
Metadata Management
mdadm manages metadata for RAID arrays through superblocks that store critical configuration data, enabling array identification, assembly, and reconstruction after device failures or system reboots. This metadata includes details such as the array's UUID, RAID level, device roles, and synchronization states, which allow the kernel's MD driver to recognize and activate arrays without relying on external configuration files. Internal metadata formats are the default, using superblock versions ranging from 0.90 to 1.2, while external formats provide compatibility with hardware RAID controllers.[4][3]
The original metadata version 0.90, introduced with early MD RAID implementations, places the superblock at the end of each component device, requiring 64 KiB of reserved space at the end of the device to accommodate the 4 KiB superblock in a 64 KiB aligned block, and supports up to 28 devices per array with a 2 TiB limit per device. Version 1.0 improves upon this by also locating the superblock at the device end but adds support for checkpointing during resynchronization and removes some legacy restrictions, making it suitable for larger arrays. Version 1.1 shifts the superblock to the start of the device for better compatibility with partitioned disks, while version 1.2—the current default—positions it 4 KiB from the start, offering enhanced flexibility with fewer size limits and explicit support for write-intent bitmaps. These internal superblocks are typically compact, with version 1.x formats using a fixed 4 KiB size to minimize data displacement.[4][3][20]
External metadata formats, supported since Linux kernel 2.6.27, store configuration data separately from the array devices, often on a dedicated partition or container. This approach enhances interoperability with firmware-based "Fake RAID" systems, such as Intel's Matrix Storage Manager (IMSM), and conforms to the Disk Data Format (DDF) standard for enterprise environments, allowing mdadm to manage arrays created by hardware controllers without proprietary tools. DDF, though deprecated in favor of IMSM for Intel platforms, enables container-based arrays where metadata resides externally, facilitating migration between software and hardware RAID setups.[4][3]
To inspect and decode metadata, mdadm provides the --examine option, which parses superblocks on component devices to display array details like UUID, level, and member states, aiding in manual verification and troubleshooting. For arrays with bitmaps, the --examine-bitmap flag extracts recovery information, such as dirty block locations. Bitmap support, integrated in version 1.2 and later, tracks modified blocks during unclean shutdowns using an internal write-intent bitmap stored within the superblock and replicated across devices; this optimizes resynchronization by resuming only from the last checkpoint, with a default chunk size of 64 MiB that can be adjusted for performance. Bitmaps can be added or removed post-creation and are essential for reducing recovery times in large arrays.[4]
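A short sketch of these inspection and bitmap commands, assuming /dev/sdb1 is an array member and /dev/md0 an assembled array:

```bash
mdadm --examine /dev/sdb1                 # dump the superblock of one component
mdadm --examine --brief --scan            # one-line ARRAY summaries for all detected members
mdadm --examine-bitmap /dev/sdb1          # show write-intent bitmap details, if present
mdadm --grow /dev/md0 --bitmap=internal   # add an internal bitmap to an existing array
```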
Booting and Initialization
mdadm enables booting from software RAID arrays by supporting the assembly of arrays early in the boot process, primarily through integration with the initramfs and bootloader configurations. For BIOS-based systems, the /boot partition is typically configured on a RAID 1 array to ensure compatibility, as bootloaders like GRUB can read individual member partitions as standard filesystems without needing full RAID awareness.[3][21] This setup allows the kernel and initramfs to be loaded from mirrored devices, providing redundancy for critical boot files.
Integration with the initramfs is essential for automatic array assembly during boot. The /etc/mdadm.conf file specifies array configurations for auto-assembly, and tools like dracut (on Fedora and RHEL derivatives) or initramfs-tools (on Debian and Ubuntu) include hooks to scan devices, include the mdadm binary, and activate arrays using commands like mdadm --assemble --scan before mounting the root filesystem.[22] For example, in initramfs-tools, hook scripts in /usr/share/initramfs-tools/hooks/ copy mdadm and its dependencies into the initramfs image, ensuring arrays are assembled before the boot sequence proceeds. Regenerating the initramfs after changes to mdadm.conf is required to incorporate updated configurations.
In legacy systems with kernels prior to version 2.6, special handling was necessary for booting from RAID arrays, often requiring kernel command-line parameters like md= to manually specify devices and RAID levels without relying on superblock metadata.[3] Modern kernels since 2.6.9 automatically detect and assemble arrays via embedded metadata during boot, provided the md driver is compiled into the kernel or loaded as a module, simplifying initialization without explicit parameters.[3][23]
For UEFI systems, mdadm supports booting from RAID arrays using GPT partitioning, where the EFI System Partition (ESP) can be mirrored in RAID 1 using metadata formats such as version 0.90 or 1.0 placed at the end of each device, so that firmware can still read every member as a plain filesystem. External metadata formats also facilitate handoff from Fake RAID (firmware-assisted) configurations, allowing mdadm to take over array management post-bootloader. Bootloaders like GRUB or systemd-boot are configured similarly to BIOS setups, referencing the assembled md device (e.g., /dev/md0) or UUIDs for the root and boot partitions.[21]
Troubleshooting boot issues with degraded arrays often involves forcing assembly in the initramfs rescue shell using mdadm --assemble --scan --run, which starts the array despite missing or failed devices, enabling the system to boot in a reduced-redundancy state for subsequent repairs. This option overrides default safety checks that prevent starting degraded arrays, but it should be used cautiously to avoid data loss. During scanning, mdadm relies on metadata formats like those detailed in the Metadata Management section to identify and validate arrays.
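A typical sequence for recording arrays and rebuilding the initramfs, with the Debian-style configuration path shown as an assumption (other distributions use /etc/mdadm.conf):

```bash
# append ARRAY lines for the currently assembled arrays
mdadm --detail --scan >> /etc/mdadm/mdadm.conf

# regenerate the initramfs so early boot sees the new configuration
update-initramfs -u     # Debian/Ubuntu (initramfs-tools)
dracut --force          # Fedora/RHEL (dracut)
```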
Monitoring Capabilities
mdadm provides robust tools for runtime monitoring of software RAID arrays, enabling administrators to assess health, detect issues, and receive timely alerts. The mdadm --detail command displays comprehensive status information for an active array, including its operational state (such as active or degraded), the count of active devices, failed disks, and available spares, as well as progress metrics for ongoing processes like resynchronization or rebuilding.[1] For examining individual components without an assembled array, mdadm --examine retrieves metadata from device superblocks, revealing array UUID, RAID level, and role assignments to verify consistency and potential faults.[1] These commands are essential for manual status checks and scripting periodic health verifications.
Kernel-level insights are accessible via the /proc/mdstat interface, which exposes real-time statistics for all MD arrays, including device membership, activity levels, and detailed synchronization progress (e.g., percentage complete and speed).[3] Executing cat /proc/mdstat yields output like active disks, recovery status, and bitmap usage, making it a lightweight method for integration into monitoring scripts or dashboards without invoking user-space tools.[6]
Continuous event-based monitoring is handled by mdadm --monitor, which operates as a background daemon to poll specified arrays (supporting RAID levels 1, 4, 5, 6, and 10) for state changes such as device failures, spare activation, or degradation.[1] Alerts can be configured for delivery via email by setting the MAILADDR option in /etc/mdadm.conf (e.g., MAILADDR admin@example.com), triggering notifications for events like Fail, DegradedArray, RebuildStarted, or DeviceDisappeared.[6] For advanced alerting, the --program or --alert options invoke custom scripts, which may integrate with systems like SNMP by generating traps upon detection of issues.[1] Syslog output is also supported via the --syslog flag for logging events to system logs.[1]
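A sketch of running the monitor as a daemon; the alert-script path is hypothetical:

```bash
# poll all configured arrays every 300 seconds, logging events to syslog
mdadm --monitor --scan --daemonise --delay=300 --syslog

# alternatively, hand each event to a custom handler (hypothetical script)
mdadm --monitor --scan --daemonise --program=/usr/local/sbin/raid-alert.sh
```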
The mdmpd daemon, once shipped alongside mdadm for monitoring multipath device failures and path recovery, was deprecated with kernel 2.6.10-rc1 in 2004 and has been superseded by Device Mapper Multipath (DM-Multipath) for such functionality.
mdadm complements disk-level monitoring tools like smartmontools, where the latter's smartd daemon performs predictive failure analysis on individual drives (e.g., via S.M.A.R.T. attributes), allowing early detection of issues that could impact array integrity before mdadm reports degradation. Write-intent bitmaps, enabled with mdadm --grow --bitmap=internal or external files, track modified regions during unclean shutdowns to accelerate resyncs, with their status and effectiveness observable in /proc/mdstat during recovery operations.[3]
Usage and Management
Command-Line Interface
The mdadm command-line interface provides a flexible syntax for managing Linux software RAID arrays, following the general structure mdadm [mode] [options] <devices>, where the mode specifies the primary operation, options modify behavior, and devices list the relevant block devices or array identifiers.[10] Common modes include --create for initializing a new array with metadata superblocks and activating it, --assemble for scanning and activating existing arrays from component devices, and --manage for runtime operations such as adding or removing devices from an active array.[10] This structure allows users to perform array creation, assembly, and maintenance in a single tool, reducing the need for multiple utilities.[2]
Key global options enhance usability across modes, such as --verbose (or -v), which increases output detail and can be specified multiple times for greater verbosity; --config (or -c), which points to a configuration file like the default /etc/mdadm.conf for array definitions and scanning rules; and --help (or -h), which displays general usage information or mode-specific details when invoked with a mode.[10] These options apply regardless of the selected mode, enabling consistent control over logging, configuration sourcing, and quick reference access.[10]
The configuration file /etc/mdadm.conf uses a simple, keyword-based format to define arrays and settings for automatic detection and monitoring, with sections starting with keywords like ARRAY or MAILADDR.[24] The ARRAY section specifies an array's device path and identity tags, such as ARRAY /dev/md0 UUID=3aaa0122:29827cfa:5331ad66:ca767371, allowing mdadm to identify and assemble components during scans.[24] The MAILADDR section configures email notifications for monitoring, limited to a single address such as MAILADDR admin@example.com, which mdadm uses in --monitor mode with --scan for auto-detection of issues.[24] Comments begin with #, and lines can be continued by indenting subsequent lines with leading whitespace, ensuring readability and flexibility.[24]
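A minimal configuration file along these lines (the UUID is taken from the example above; the mail address is illustrative):

```
# /etc/mdadm.conf
DEVICE partitions
ARRAY /dev/md0 UUID=3aaa0122:29827cfa:5331ad66:ca767371
MAILADDR admin@example.com
```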
Installation of mdadm is straightforward on most Linux distributions via package managers; for example, on Debian-based systems, it is available as apt install mdadm, while on Red Hat-based systems, yum install mdadm or dnf install mdadm suffices.[2] For custom builds, the source code can be compiled from the official GitHub repository using make commands, including make install-bin for binaries and make install-systemd for service integration.[2]
Error handling in mdadm includes standardized exit codes to indicate operation outcomes, varying by mode; for instance, in miscellaneous modes, code 0 denotes normal success, 1 indicates a failed device, 2 signals an unusable array, and 4 represents a general error, such as invalid arguments.[10] These codes allow scripts and administrators to detect and respond to issues programmatically, with verbose output providing additional diagnostic details when enabled.[10]
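As an illustration, a cron-style health check can act on the exit status of mdadm --detail --test (the array name is assumed):

```bash
mdadm --detail --test /dev/md0 >/dev/null 2>&1
status=$?
if [ "$status" -ne 0 ]; then
    # 1 = failed device, 2 = unusable array, 4 = general error
    logger -t raid-check "/dev/md0 health check returned exit code $status"
fi
```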
Assembly and Disassembly
Assembly of an MD array activates an existing array from its component devices, making it available as a block device for use. The primary command for manual assembly is mdadm --assemble followed by the target MD device name and the component devices, such as mdadm --assemble /dev/md0 /dev/sda1 /dev/sdb1. This mode verifies that the components match the array's metadata before activating it.[4]
To identify and assemble arrays using unique identifiers, the --uuid option can be specified, allowing assembly even when device paths have changed; for example, mdadm --assemble /dev/md0 --uuid=12345678-1234-1234-1234-1234567890ab /dev/sda1 /dev/sdb1 assembles the array by matching the provided UUID from the superblock metadata.[4] The missing keyword, by contrast, is used with --create to leave a slot unpopulated for a device to be added later; assembling with absent members is instead done by listing only the available devices and forcing activation with --run, as described below.[4]
Automatic assembly occurs at runtime through the --assemble --scan option, which scans for arrays defined in the configuration file /etc/mdadm.conf or detects them via device metadata and udev rules for hotplug events.[4] This mode assembles all eligible arrays without explicit device listing, relying on the config file's ARRAY lines that specify device names, UUIDs, or other identifiers.[4]
For degraded arrays where fewer components are available than required for full redundancy, the --run option forces activation if data accessibility is possible, such as starting a RAID1 array with only one disk or a RAID5 with one missing; for instance, mdadm --assemble --run /dev/md0 /dev/sda1 overrides safety checks to begin operation in degraded mode, potentially using spare devices if configured.[4] The --force option can also be used alongside --run to assemble despite mismatched or outdated metadata.[4]
Disassembly deactivates an active array, stopping its operation and releasing the underlying devices. The command mdadm --stop /dev/md0 safely stops the specified array if it is not in use by the system.[4] For arrays that are stuck or mounted, the --force option can compel the stop, as in mdadm --stop --force /dev/md0, though this risks data inconsistency if filesystems remain active.[4] To stop all active arrays, --stop --scan can be employed, mirroring the scan-based assembly process.[4] Boot-time assembly procedures build on these runtime mechanisms but incorporate initramfs integration for early activation.[4]
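A few representative invocations, with device names assumed:

```bash
mdadm --assemble --scan                       # assemble everything defined in mdadm.conf
mdadm --assemble --run /dev/md0 /dev/sda1     # force a degraded start with the members at hand
mdadm --stop /dev/md0                         # deactivate one array
mdadm --stop --scan                           # deactivate all active arrays
```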
Maintenance Operations
mdadm provides several commands for maintaining RAID arrays after initial setup, allowing administrators to modify array composition, repair issues, and optimize performance without downtime in many cases. These operations leverage the Linux kernel's MD driver capabilities and require the array to be active unless specified otherwise. For instance, adding or removing devices can be performed hot-plug style on redundant arrays like RAID1 or RAID5.[1] To add a device to an active array, the --add option is used, which integrates the new device either as a spare or by re-adding a previously removed one, triggering a resync if necessary. The command mdadm /dev/md0 --add /dev/sdd adds the device /dev/sdd to the array /dev/md0, provided the array has redundancy to tolerate potential inconsistencies during integration.[1] Removal is a two-step process: an active device is first marked as faulty with --fail, after which --remove, as in mdadm /dev/md0 --remove /dev/sdd, detaches the failed or spare device from the array metadata without affecting data availability in redundant configurations.[1]
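A typical disk-replacement sequence using these commands, with illustrative device names:

```bash
mdadm /dev/md0 --fail /dev/sdd1      # mark the outgoing member as faulty (if not already failed)
mdadm /dev/md0 --remove /dev/sdd1    # detach it from the array
mdadm /dev/md0 --add /dev/sde1       # add the replacement; the rebuild starts automatically
watch cat /proc/mdstat               # follow rebuild progress
```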
Reshaping arrays enables changes to the RAID level or capacity expansion using the --grow option, which requires kernel support for the desired transformation. For example, converting a RAID5 array to RAID6 involves mdadm --grow /dev/md0 --level=6 --backup-file=/backup, where the backup file preserves the critical sections being rewritten to ensure safe operation during the reshape process.[1] Capacity can be expanded in two ways: adding devices uses --grow with --raid-devices=, while --size= (for example --size=max) extends the amount of each member device that is used, after which the filesystem must be grown separately; these operations demand kernel versions supporting the feature, such as 2.6.17 or later for RAID5 growth.[1] These operations proceed incrementally, allowing continued array access, though performance may degrade temporarily.
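A sketch of both growth paths, assuming an ext4 filesystem and illustrative device names:

```bash
# path 1: add a fourth member and reshape a three-disk RAID5
mdadm /dev/md0 --add /dev/sde1
mdadm --grow /dev/md0 --raid-devices=4 --backup-file=/root/md0-reshape.bak

# path 2: after replacing members with larger disks, use all available space
mdadm --grow /dev/md0 --size=max

# in either case, grow the filesystem afterwards
resize2fs /dev/md0
```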
Repair and resynchronization maintain data integrity by reintegrating devices or synchronizing contents across members. The --re-add option reintegrates a previously removed device, leveraging write-intent bitmaps if enabled to accelerate recovery: mdadm /dev/md0 --re-add /dev/sdd.[1] Automatic resync occurs upon array assembly if discrepancies are detected, ensuring all devices reflect the current data state.[1]
Bitmap management enhances recovery efficiency by tracking unsynchronized regions, and can be enabled or modified during growth operations. To add an internal bitmap, the command mdadm --grow /dev/md0 --bitmap=internal is issued, which stores the bitmap within the array metadata to minimize resync times after unclean shutdowns.[1] Disabling it uses --bitmap=none, useful for arrays without frequent interruptions.[1]
Scrubbing verifies array consistency by reading all blocks and checking that mirrors or parity agree; a check is typically started by writing check to /sys/block/mdX/md/sync_action (recent mdadm releases also accept --action=check), and mdadm --wait /dev/md0 blocks until the check or any ongoing resync completes.[1] This operation, often scheduled periodically, integrates with monitoring tools to report resync progress.[1]
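A minimal scrub sequence via sysfs (the array name is assumed):

```bash
echo check > /sys/block/md0/md/sync_action   # start a consistency check
mdadm --wait /dev/md0                        # block until the check finishes
cat /sys/block/md0/md/mismatch_cnt           # non-zero indicates detected inconsistencies
```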
Technical Details
Device Naming Conventions
mdadm employs specific naming conventions for RAID devices to ensure consistent identification across system operations and reboots. By default, RAID arrays are named using the format /dev/md<n>, where <n> is a decimal number ranging from 0 to 255, corresponding to the minor device number assigned by the kernel.[1] Partitionable arrays instead use the form /dev/md_d<n>, where <n> likewise matches the minor number.[1]
For partitionable arrays, partitions are denoted by appending p<m> to the device name, resulting in formats such as /dev/md_d<n>p<m> or /dev/md<n>p<m>, where <m> indicates the partition number.[1] This allows standard partitioning tools like fdisk or parted to operate on the array as if it were a single disk. Additionally, custom persistent names can be assigned during array creation with the --name= option (stored in version-1 metadata), leading to device paths like /dev/md/home under the /dev/md/ directory.[1]
Each RAID array is assigned a unique 128-bit UUID upon creation, randomly generated unless specified with --uuid=, and an optional label (name) stored in the superblock for version-1 metadata.[1] These identifiers facilitate reliable assembly and configuration; for instance, the command mdadm --detail --scan outputs ARRAY lines in a format suitable for /etc/mdadm.conf, such as ARRAY /dev/md0 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx name=home, enabling persistent references by UUID or label rather than volatile device paths; filesystems created on the array can likewise be referenced in /etc/fstab by their own filesystem UUID.[25]
To mitigate the volatility of numeric device names like /dev/md0, which can shift based on assembly order, mdadm integrates with udev to create symbolic links. The udev rules shipped with mdadm generate symlinks such as /dev/disk/by-id/md-uuid-<UUID> and /dev/disk/by-id/md-name-<name>, along with /dev/md/<name> for named arrays, all pointing to the actual device node and allowing applications to reference arrays by stable identifiers.[4] This mapping ensures robustness in dynamic environments.[3]
mdadm's naming scheme is compatible with layered storage tools, such as LVM, where RAID devices can serve as physical volumes (PVs) using their persistent names or UUID symlinks for volume group creation. Device-mapper can also stack on mdadm arrays, treating them as underlying block devices in multipath or snapshot configurations.[1]
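For instance, an array can be used as an LVM physical volume (names are illustrative):

```bash
pvcreate /dev/md0                  # initialize the array as a physical volume
vgcreate vg_raid /dev/md0          # build a volume group on top of it
lvcreate -n data -L 100G vg_raid   # carve out a logical volume
```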
RAID 1 Implementation Specifics
mdadm implements RAID 1 as a mirroring scheme where data is duplicated bit-for-bit across all member devices in the array, ensuring redundancy by maintaining identical copies on each. This supports configurations with 2 to N devices, where N is practically limited by kernel memory and I/O constraints rather than a hard cap. The mirroring process occurs synchronously during writes, with the kernel's MD driver handling the replication to all active legs of the array.[3]
The synchronization process in RAID 1 begins with an initial resync upon array assembly if the devices are not already in sync, copying data sector-by-sector from a reference device to the others starting from sector 0. This can be manually initiated or forced, for example using the command mdadm --assemble --update=resync /dev/md0, which marks the array as dirty to trigger the operation. For ongoing maintenance, incremental resyncs leverage write-intent bitmaps, stored on the member devices or in a separate file, to track modified regions, allowing recovery to focus only on discrepancies after interruptions like power failures and significantly reducing resync time compared to full scans.[3][1]
Failure handling in mdadm's RAID 1 operates automatically: upon detecting I/O errors such as failed reads or writes on a member device, the kernel marks the affected device as faulty in /sys/block/mdX/md/dev-YYY/state and redirects operations to the surviving mirrors, maintaining array availability in degraded mode. Recovered or replacement devices can be reintegrated using mdadm /dev/md0 --re-add /dev/sdZ, which initiates a targeted resync to rebuild the mirror without copying the entire device, provided a bitmap is enabled for efficiency.[3][1]
Performance characteristics of RAID 1 in mdadm include a write penalty: each write operation is replicated to all mirrors, doubling the total I/O issued for a two-device array, so sustained write throughput is limited to that of the slowest member rather than scaling with the number of devices. Reads, however, benefit from balancing across all in-sync devices, distributing requests to maximize aggregate bandwidth and reduce latency, with the kernel selecting legs based on heuristics like recency and queue depth.[3]
Edge cases in RAID 1 implementation address recovery and resource management, such as using bitmaps (configured via mdadm --grow /dev/md0 --bitmap=internal) to enable crash-safe incremental resyncs, limiting full scans to rare scenarios. For multi-device arrays, synchronization rates are throttled to prevent overwhelming the system, adjustable through /sys/block/mdX/md/sync_speed_min and /sys/block/mdX/md/sync_speed_max (in KiB/s), with defaults balancing speed and stability; for instance, the minimum rate ensures progress even under load.[3]
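A sketch of adjusting these knobs and observing the effect (the array name is assumed; values are in KiB/s):

```bash
echo 50000  > /sys/block/md0/md/sync_speed_min   # raise the resync floor
echo 200000 > /sys/block/md0/md/sync_speed_max   # cap the resync rate
cat /proc/mdstat                                 # shows current resync speed and estimated finish
```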