
Persistent memory

Persistent memory, also known as non-volatile memory (NVM) or storage-class memory (SCM), is a byte-addressable form of computer memory that delivers the low-latency, load/store access speeds of dynamic random-access memory (DRAM) while providing data persistence akin to flash storage, ensuring information remains intact even after power loss. This technology bridges the traditional divide between volatile main memory and slower, block-addressable secondary storage, enabling direct CPU access to large-scale persistent data structures without the overhead of file systems or operating system mediation. The development of persistent memory has roots in decades of research into non-volatile technologies, with early concepts dating back to 1980s explorations of object-oriented persistent stores, but practical hardware advancements accelerated in the 2010s. Key enabling technologies include phase-change memory (PCM), which alters the state of chalcogenide materials to store data; spin-transfer torque magnetic random-access memory (STT-MRAM), leveraging magnetic resistance changes; resistive random-access memory (ReRAM), based on resistance variations in dielectric materials; and 3D XPoint, a cross-point array architecture announced by Intel and Micron in 2015. Commercial implementations, such as Intel's Optane DC persistent memory modules introduced in 2019, integrate these into dual in-line memory module (DIMM) formats, often paired with DRAM, with persistence ensured through software-managed cache flushes and hardware durability instructions to handle power failures, though Intel discontinued Optane production in 2022. These systems support capacities up to terabytes per socket at costs lower than DRAM, while maintaining nanosecond-scale latencies only 2–3 times higher than DRAM. In computer architecture, persistent memory introduces new paradigms for data management, allowing applications to treat memory as both fast and durable, which enhances performance in domains like databases, in-memory analytics, and real-time processing. For instance, it enables direct manipulation of persistent objects via standard memory instructions, bypassing traditional I/O bottlenecks and supporting features like cache-line write-back (CLWB) and store fencing (SFENCE) for durability. Programming models, such as those in the SNIA NVM Programming Model specification or Intel's Persistent Memory Development Kit (PMDK), facilitate safe operations through libraries like libpmemobj, which handle transactions and failure-atomic updates. Despite its advantages, persistent memory poses challenges in ensuring crash consistency and atomicity, as data in CPU caches or buffers may not persist without explicit flushes, leading to potential inconsistencies on power failures. Studies have identified common pitfalls, such as missing flush or fence instructions, in over 80% of analyzed bugs in persistent-memory applications, necessitating tools like Agamotto for verification and robust file systems like NOVA or DAX-enabled ext4 optimized for byte-addressability. As of 2025, the persistent memory market continues to expand with alternatives to Optane, including CXL-enabled systems and emerging NVM technologies like advanced STT-MRAM and ReRAM, supporting growing applications in artificial intelligence and data analytics. Ongoing research focuses on hybrid architectures, remote PM disaggregation, and energy-efficient persistence to broaden adoption beyond early datacenter uses.

Overview

Definition and Key Characteristics

Persistent memory (PMem), also known as non-volatile random-access memory (NVRAM) in emerging contexts, is a storage technology that provides byte-addressable, low-latency access to persistent data, enabling direct load and store operations similar to dynamic random-access memory (DRAM) while retaining information across power failures. This combines the performance of DRAM with the durability of traditional non-volatile storage, allowing applications to treat large datasets as if they were in main memory without the need for intermediate buffering or explicit file I/O. Unlike conventional storage hierarchies, PMem integrates into the processor's memory address space, facilitating fine-grained, byte-level access at speeds orders of magnitude faster than block-based devices. Key characteristics of PMem include its byte-addressability, which permits granular read and write operations without block alignment restrictions, ensuring efficient data manipulation comparable to DRAM but with persistence. Data retention occurs without power, distinguishing it from volatile alternatives like DRAM, while access latencies in early implementations like Intel Optane (2019) were approximately 100-300 ns—3 to 6 times slower than DRAM's 50-100 ns yet vastly quicker than solid-state drives. Endurance varies by underlying technology; for example, phase-change memory variants supported 10^8 to 10^12 write cycles per cell, often mitigated through wear-leveling techniques to support repeated updates. Capacities scaled from gigabytes to terabytes per module in past products, enabling larger working sets than traditional DRAM, and modules integrated onto standard memory buses for compatibility. PMem enables transparent data persistence by allowing hardware-level durability without explicit file input/output operations, shifting from block-oriented I/O paradigms to memory-mapped interfaces that reduce overhead in data-intensive applications. In basic architecture, it employed emerging non-volatile technologies such as phase-change memory or resistive RAM, often layered with DRAM as a caching mechanism to optimize hit rates and mask higher latencies. While early commercial products like Intel's Optane DC persistent memory modules (introduced 2019, discontinued 2023) demonstrated these traits, current developments focus on disaggregated systems via standards like Compute Express Link (CXL) to enable broader adoption. This positions PMem as a distinct tier in the memory-storage hierarchy, offering DRAM-like speed for computation with storage-like reliability, though at somewhat reduced performance compared to pure DRAM and lower densities than disk-based systems.
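The load/store access model can be illustrated with a small sketch. The example below memory-maps a file on a DAX-capable filesystem (the mount point /mnt/pmem and file name are hypothetical) and updates it with ordinary CPU stores; with MAP_SYNC, durability only requires flushing the affected range, here done with msync(). This is a minimal illustration under those assumptions (Linux with glibc 2.28+ exposing MAP_SYNC), not a production pattern.

```c
/* Minimal sketch: byte-addressable load/store access to persistent memory
 * via a DAX-mapped file. Assumes a DAX-capable filesystem mounted at
 * /mnt/pmem (hypothetical path); error handling is abbreviated. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    size_t len = 4096;
    int fd = open("/mnt/pmem/example", O_CREAT | O_RDWR, 0644);
    if (fd < 0 || ftruncate(fd, len) != 0)
        return 1;

    /* MAP_SYNC (with MAP_SHARED_VALIDATE) requests a mapping in which CPU
     * stores reach the persistent media without going through a page cache. */
    char *pmem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
    if (pmem == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    /* Ordinary store instructions update the persistent data in place ... */
    strcpy(pmem, "hello, persistent memory");

    /* ... but durability still requires an explicit flush of the range. */
    msync(pmem, len, MS_SYNC);

    munmap(pmem, len);
    close(fd);
    return 0;
}
```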

Comparison to Volatile and Non-Volatile Storage

Persistent memory occupies a unique position in the computing memory-storage hierarchy, bridging the gap between volatile main memory, such as dynamic random-access memory (DRAM), and non-volatile block-addressable storage like solid-state drives (SSDs) and hard disk drives (HDDs). DRAM offers extremely low latency—typically around 50-100 ns—and high bandwidth, often exceeding 50 GB/s in modern systems, but it is volatile, losing all data upon power failure, and is limited in capacity (e.g., tens to hundreds of GB per system) due to high cost and power demands. In contrast, SSDs and HDDs provide persistence, retaining data without power, and support much larger capacities (terabytes to petabytes), but suffer from higher latency—NVMe SSDs at 10-100 µs—and lower bandwidth (up to ~7 GB/s sequential for high-end NVMe SSDs), as they rely on block-based I/O through the operating system and storage stack. Performance-wise, persistent memory delivered latencies of approximately 100-300 ns for random accesses in 2019 implementations—3-6 times higher than DRAM but orders of magnitude lower than SSDs—while achieving bandwidths of 10-40 GB/s for reads (asymmetric, with writes lower at ~10-15 GB/s), enabling it to support in-memory computing workloads with data durability. Emerging technologies project similar metrics (120-500 ns latencies) via CXL-attached memory as of 2025. This positions persistent memory as a viable extension for applications requiring large datasets, such as databases and in-memory analytics, where it can handle fine-grained, byte-addressable operations directly from user space, avoiding the overhead of block I/O. For instance, in tiered memory configurations, persistent memory acted as a larger, slower tier backed by a DRAM cache, blending the speed of main memory with storage-like capacity. In terms of durability, persistent memory fundamentally differs from DRAM by retaining data across power cycles and system crashes without requiring explicit flushes to disk or checkpoints, thanks to its non-volatile nature and hardware-supported atomicity for small updates. Unlike SSDs, which enforce sector-aligned (typically 4 KiB) writes and involve kernel-mediated I/O, persistent memory allows direct, sub-page updates at the application level, reducing overhead and enabling crash-consistent data structures. This improves resilience for crash-sensitive workloads but may introduce endurance limits compared to SSDs' optimized wear-leveling for large-scale writes. Economically, as of 2025, persistent memory in past implementations cost more per GB than SSDs (around $0.05-0.10/GB for enterprise SSDs, amid 10-15% price rises) but was priced below server DRAM at the time of its availability; historical Optane modules were roughly $5-6/GB, while DRAM stood at about $3.30-5/GB in 2025 (up 170% year over year due to AI demand). Additionally, it offered power efficiency advantages over DRAM by eliminating refresh cycles—DRAM consumes ongoing power to maintain charge—resulting in lower standby consumption (e.g., 3-5 W per module idle) and up to 70% energy savings in certain storage workloads compared to SSD-based systems, though active power can reach 15 W under load.

History

Early Conceptual Foundations

The concept of persistent memory traces its theoretical roots to the early 1980s, particularly through the development of orthogonal persistence in programming languages. Orthogonal persistence refers to a design principle where the persistence of data—its ability to survive beyond the lifetime of the creating process—is treated as an independent property, decoupled from the data's type or scope, allowing transparent state survival without explicit save operations. This idea was pioneered in the PS-algol language, where researchers aimed to enable all data objects to persist uniformly, regardless of whether they were intended for short-term or long-term use, by integrating persistence into the language's memory model from the outset. Building on these foundations, early software-focused approaches to persistence emphasized object stores that blurred the lines between volatile memory and durable storage. In 1986, Satish Thatte proposed a persistent memory architecture for object-oriented database systems, introducing a uniform virtual address space that eliminated the traditional distinction between transient data structures in memory and persistent files on disk, enabling direct manipulation of persistent objects as if they were in main memory. Complementary techniques, such as journaling and handling "dirty" writes (uncommitted changes in buffers), emerged as key methods for ensuring consistency in these systems by logging operations to recover consistent states after failures, thereby supporting reliable persistence in distributed and database environments without relying on explicit checkpoints. The transition from purely software-based persistence to hardware considerations gained momentum in the 1990s and 2000s, as proposals explored non-volatile RAM (NVRAM) to enhance performance in data-intensive applications. A notable example is the 2004 work by Pankaj Mehra and Samuel Fineberg, which advocated for NVRAM-based "fast persistence" in high-performance computing (HPC) and online data stores to improve fault tolerance and performance, allowing critical data to remain accessible at near-DRAM speeds even after power loss or crashes. This shift highlighted the potential of hardware to support persistence natively, reducing overhead from software-only mechanisms. A pivotal milestone in these early concepts was the recognition that true persistent memory should support load/store access—direct byte-addressable operations akin to volatile RAM—rather than file-based methods involving system calls and buffering, which introduce latency and complexity. This distinction, emphasized in Thatte's architecture, enabled programmers to treat persistent and non-persistent data uniformly, paving the way for more efficient, transparent persistence models in future systems.

Modern Hardware Developments

In the 2010s, the concept of storage-class memory gained momentum as a bridge between DRAM and traditional storage, with Intel and Micron announcing 3D XPoint technology in July 2015 as a non-volatile memory offering up to 1,000 times the speed and endurance of NAND flash. This technology was commercialized through Intel's Optane products, with initial shipments of Optane SSDs beginning in early 2017, targeting high-performance storage needs. Key milestones included the standardization of NVDIMM interfaces in 2014, which enabled non-volatile dual in-line memory modules to operate on standard memory buses while preserving data during power loss. Intel launched Optane DC Persistent Memory in April 2019, integrating 3D XPoint media directly as DIMMs for server use, allowing systems to address terabytes of persistent memory at near-DRAM latencies. However, facing market adoption challenges and competitive pressures, Intel discontinued Optane production in 2022, ceasing future development to refocus resources. Advancements in persistent memory emphasized seamless integration with DDR4 and DDR5 buses, enabling byte-addressable access without specialized controllers. Optane supported hybrid operating modes, including Memory Mode for transparent extension of volatile capacity and App Direct Mode for direct application control over persistent regions, facilitating fine-grained persistence. Following the Optane phase-out, industry efforts shifted toward Compute Express Link (CXL) for disaggregating persistent memory from compute nodes, with CXL 3.0 specifications released in August 2022, followed by the CXL 3.1 specification in November 2023 and further 3.x updates, culminating in the CXL 4.0 specification on November 18, 2025, which doubles the per-lane data rate from 64 GT/s to 128 GT/s, adds support for bundled ports, and enhances memory reliability, availability, and serviceability (RAS) features to support scalable, low-latency memory pooling across fabrics. These developments were driven by demands from analytics and AI workloads, which require handling datasets larger than DRAM capacities while maintaining low-latency persistence for fault tolerance and efficiency.

Hardware Technologies

Types of Persistent Memory Devices

Persistent memory devices encompass hybrid modules and emerging non-volatile technologies designed to bridge the performance gap between volatile DRAM and traditional storage. These devices retain data across power cycles while offering byte-addressable access latencies closer to DRAM than to block-based media. NVDIMMs represent a key hybrid category, integrating DRAM with non-volatile elements for power-loss protection. The three primary types are NVDIMM-N, NVDIMM-P, and NVDIMM-F, as standardized by JEDEC. NVDIMM-N pairs DRAM with NAND flash and a backup power source, such as a supercapacitor or battery, enabling data flushing from DRAM to flash during unexpected power outages to ensure persistence. NVDIMM-P provides direct byte-addressable access to both DRAM and persistent media, supporting larger capacities through integrated non-volatile components. NVDIMM-F operates as a block-addressable flash device, akin to an SSD in DIMM form, for faster storage-tier integration. These modules achieve capacities up to 512 GB, balancing speed and durability for server environments. Emerging non-volatile media form another core category, leveraging novel materials for native persistence without hybrid DRAM reliance. Phase-change memory (PCM) alters material states via heating to store data, offering access times in the low hundreds of nanoseconds (e.g., ~300 ns for random reads in commercial implementations like Intel Optane), bridging the gap to DRAM latencies for persistent applications; for instance, PCM underlies technologies like 3D XPoint for low-latency byte-addressability. Resistive RAM (ReRAM) switches resistance in a metal-oxide layer to store data, with endurance exceeding 10^6 cycles and feature sizes below 10 nm. Magnetoresistive RAM (MRAM), particularly spin-transfer torque variants, uses magnetic tunnel junctions for non-volatile storage, delivering near-unlimited endurance (over 10^12 cycles) and access times under 10 ns. Persistent memory devices adopt various form factors to integrate with existing systems. Traditional implementations use standard DIMM slots for direct compatibility, enabling seamless upgrades in DDR4/DDR5 channels. Early prototypes often employed PCIe-attached cards for expanded capacity beyond socket limits. Future deployments leverage fabric-based interconnects like CXL 2.0, supporting diverse enclosures such as add-in cards (AICs) and EDSFF E3.S drives for pooled, disaggregated memory. In terms of performance metrics, persistent memory surpasses SSDs in endurance, supporting over 100 times more write cycles (e.g., 10^8–10^12 per cell versus 10^3–10^4 for NAND flash), reducing wear in write-intensive workloads. Densities reached up to 512 GB per module in commercial products like Intel Optane by the early 2020s, enabling terabyte-scale systems through multi-module configurations. Developments like Optane have exemplified PCM's role in commercializing these capabilities.

Notable Implementations and Architectures

One of the most prominent implementations of persistent memory is Intel Optane Persistent Memory (PMem), based on 3D XPoint technology, which features a transistor-less cross-point array architecture enabling direct access to memory cells for reduced latency and improved endurance over traditional NAND flash. Available in capacities ranging from 128 GB to 512 GB per dual in-line memory module (DIMM), Optane PMem supports two primary operating modes: Memory Mode, which extends volatile memory capacity transparently to the operating system without requiring software changes, and App Direct Mode (also referred to as Direct Access or DAX mode), which provides applications with direct, byte-addressable access to the persistent media for fine-grained control over data durability. This design positions Optane PMem as a bridge between DRAM and storage, offering latencies closer to memory while ensuring data persistence across power cycles. Hewlett Packard Enterprise (HPE) developed NVDIMM-N modules as an early hybrid persistent memory solution, integrating DRAM with NAND flash for fast access and a dedicated backup power source to maintain data during outages. These modules, offered in 8 GB and 16 GB capacities, operate at native DDR4 speeds of up to 2,666 MT/s and rely on the HPE Smart Storage Battery—a supercapacitor-based system—to automatically transfer data from volatile DRAM to non-volatile flash upon power failure, ensuring near-instant persistence without significant performance degradation. Unlike purely non-volatile designs, NVDIMM-N prioritizes DRAM-like speed for active workloads while using flash for durability, making it suitable for high-availability environments. Emerging persistent memory architectures on PCIe Gen4 and Gen5 interfaces incorporate enhanced error correction codes (ECC) and dynamic wear-leveling algorithms to mitigate bit errors and evenly distribute write operations across cells, extending device lifespan in demanding environments. These developments build on NVDIMM prototypes by integrating hybrid controllers that dynamically allocate data between volatile DRAM tiers for speed and non-volatile tiers for persistence, optimizing overall system efficiency. Key architectural features across these implementations include flush-on-failover mechanisms, which automatically persist cache contents to non-volatile media during power interruptions to prevent data loss, as demonstrated in whole-system persistence designs. CPU integration is facilitated by specialized instructions, such as Intel's CLFLUSHOPT, introduced in 2016 with Skylake processors, which optimizes cache-line flushing to persistent memory by reducing ordering overhead compared to earlier flush operations. Following Intel's discontinuation of Optane PMem production—effective for the 300 series in 2023 and winding down for the 200 series by late 2025—the industry has pivoted toward open standards like Compute Express Link (CXL) and Gen-Z for scalable, pooled persistent memory fabrics that enable disaggregated resource sharing in data centers. As of 2025, companies such as SMART Modular Technologies are demonstrating CXL-attached persistent memory modules, enabling pooled, high-performance PM in disaggregated systems.
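As a concrete illustration of the flush instructions mentioned above, the sketch below copies a buffer into a persistent-memory mapping and then writes back the affected cache lines with the CLFLUSHOPT intrinsic followed by a store fence. It assumes an x86-64 compiler with CLFLUSHOPT support (e.g., -mclflushopt) and that dst points into a persistent mapping; it is a simplified example, not a vendor-recommended routine.

```c
/* Illustrative sketch: make a range of persistent memory durable using the
 * CLFLUSHOPT and SFENCE intrinsics. Assumes dst points into a PMem mapping. */
#include <immintrin.h>
#include <stdint.h>
#include <string.h>

#define CACHE_LINE 64

void persist_range(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);                 /* ordinary stores into PMem      */

    uintptr_t start = (uintptr_t)dst & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end = (uintptr_t)dst + len;
    for (uintptr_t line = start; line < end; line += CACHE_LINE)
        _mm_clflushopt((void *)line);      /* write back each dirty line     */

    _mm_sfence();                          /* order flushes before later stores */
}
```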

Software and Programming

Programming Models and Interfaces

Programming models for persistent memory abstract the hardware's byte-addressable, non-volatile nature to enable developers to treat it as an extension of volatile memory while ensuring durability. These models typically categorize interactions based on the volatility of the source and target memory spaces. In the volatile-volatile model, persistent memory is managed transparently as additional DRAM, with the operating system or hardware handling persistence automatically, as exemplified by Intel Optane's Memory Mode (discontinued in 2023) where applications operate on a large pool of volatile memory without code changes. Emerging standards like Compute Express Link (CXL), as of 2025, extend this model to disaggregated and pooled memory environments while preserving compatibility with existing software interfaces. The volatile-persistent model employs persistent memory as a cache or tier below volatile DRAM, allowing hybrid access where hot data resides in DRAM and colder data persists directly, optimizing for performance and capacity. The persistent-persistent model provides direct byte-addressable access to persistent regions, requiring explicit durability operations but offering low-latency durability for applications like databases. For safe memory allocation in these models, techniques such as epoch-based reclamation are used to defer deallocation until all threads have advanced past a safe epoch, preventing use-after-free errors in persistent contexts without full garbage collection. Standard operating system interfaces facilitate direct access to persistent memory. On Linux, Direct Access (DAX) enables memory-mapped I/O (mmap) on persistent memory files, bypassing the page cache for load/store operations directly to the device; it was introduced in kernel versions around 2016 and is supported on filesystems like ext4 and XFS with the -o dax mount option. On Windows, persistent memory APIs—supporting block access via storage-class memory drivers and Direct Access (DAX) for byte-addressable mappings—were first introduced in Windows Server 2016 and fully supported in Windows Server 2019, allowing applications to use PMem as cache or capacity drives through cmdlets like Get-PmemDisk. For distributed systems, remote direct memory access (RDMA) over persistent memory extends RDMA protocols to enable direct remote access to non-volatile regions, using mechanisms like File Memory Regions (FileMR) to map file offsets without redundant translations, improving scalability in networked storage. The Persistent Memory Development Kit (PMDK), released by Intel in 2018, provides a suite of libraries for higher-level programming, including support for transactional operations via libpmemobj, memory pool management with libpmempool for creating and versioning persistent pools, and tools like pmdk-convert for layout updates across PMDK versions. For low-level control, libpmem offers functions to manage persistence, such as pmem_persist(), which combines cache flushing and draining to ensure data reaches the media. At the hardware level, CPU instructions ensure ordered persistence. On x86 platforms, developers use CLWB (Cache Line Write Back) to write modified lines back to the persistence domain without invalidation, followed by SFENCE (Store Fence) to order the operation, forming the basis for higher-level primitives like pmem_flush() in libpmem for flushing and pmem_drain() for draining buffers and ensuring writes complete. These instructions, available since Intel's Skylake-era processors, allow fine-grained control over durability without relying on expensive whole-cache flushes.
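For the libpmem interface described above, a minimal usage sketch might look like the following: it maps a file on a DAX filesystem with pmem_map_file(), writes a string, and calls pmem_persist() (or falls back to pmem_msync() when the mapping is not genuine persistent memory). The path /mnt/pmem/log is a placeholder; link with -lpmem.

```c
/* Sketch of the low-level libpmem flow: map a persistent memory file,
 * store data, and make it durable with pmem_persist(). */
#include <libpmem.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    size_t mapped_len;
    int is_pmem;

    /* Create (or open) a 4 KiB persistent memory file and map it. */
    char *addr = pmem_map_file("/mnt/pmem/log", 4096, PMEM_FILE_CREATE,
                               0644, &mapped_len, &is_pmem);
    if (addr == NULL) {
        perror("pmem_map_file");
        return 1;
    }

    strcpy(addr, "durable record");

    if (is_pmem)
        pmem_persist(addr, mapped_len);   /* flush (CLWB/CLFLUSHOPT) + drain */
    else
        pmem_msync(addr, mapped_len);     /* fall back to msync() otherwise  */

    pmem_unmap(addr, mapped_len);
    return 0;
}
```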

Ensuring Data Persistence

Ensuring data persistence in persistent memory (PMem) systems requires explicit mechanisms to guarantee that writes are durably committed despite potential system crashes or power failures, as PMem does not automatically persist data held in processor caches. Developers typically rely on low-level instructions to flush modified cache lines to PMem and enforce ordering. For instance, the x86 architecture provides the CLWB (Cache Line Write Back) instruction to write back a specific cache line to PMem without immediate eviction, which is more efficient than the older CLFLUSH instruction because it allows the line to remain in the cache for potential reuse. This flush must be followed by a store fence, such as SFENCE, to ensure that all preceding stores are visible to subsequent operations before any further writes occur, preventing reordering by the processor that could lead to inconsistent states after a crash. Together, these operations—often termed "persist barriers"—provide the foundation for durability, though they introduce performance overhead due to the latency of flushing and fencing, which can be mitigated by batching multiple updates before a single barrier. To achieve atomicity across multiple objects, persistent memory systems adapt transactional memory techniques, extending traditional in-memory transactions to handle durability. In persistent transactions, systems use logging protocols to ensure atomicity: undo logs record pre-update values for rollback on failure, while redo logs capture post-update states for application during recovery, enabling multi-object updates without special hardware support. These logs are persisted using flush-and-fence operations at commit, allowing systems to replay or revert operations post-crash while maintaining consistency. Libraries like PMDK provide interfaces that incorporate such transactional support, simplifying implementation for developers. Failure-atomicity techniques address the risk of partial updates during crashes, ensuring operations complete entirely or not at all from the recovery perspective. Copy-on-write, a form of out-of-place update, creates a full duplicate of the data before modification; updates proceed on the copy, and a pointer swap makes it visible only upon successful commit, reverting to the original on failure. Shadow paging similarly maintains dual versions of data, updating the shadow copy in a failure-atomic manner—often via slotted structures that persist in fixed-size units—and switching pointers post-commit, which avoids in-place mutations that could leave data in inconsistent states. These methods, while incurring space and copy overhead, provide strong guarantees for complex updates like multi-field data structure modifications. At the filesystem level, PMem-optimized designs integrate persistence mechanisms to ensure both data and metadata durability with minimal overhead. NOVA, presented in 2016, employs a log-structured approach with lightweight journaling for metadata, where updates are appended to a persistent log and failure-atomicity is achieved through ordered flushes that commit journal entries before data, enabling fast recovery via log replay without full scans. This contrasts with traditional disk filesystems by leveraging PMem's byte-addressability for direct access, reducing synchronization costs while maintaining POSIX compliance.
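The undo-logging and persist-barrier pattern can be sketched as follows. This is a simplified, hypothetical illustration (not PMDK's internal implementation): the old value and the log's valid flag are flushed and fenced before the in-place update, so a crash at any point leaves either the old value intact or enough log state for recovery to roll back.

```c
/* Simplified undo-log sketch: persist the log entry before mutating data in
 * place, so recovery can roll back a torn update. Assumes x86-64 with CLWB
 * (compile with -mclwb); layout and recovery logic are illustrative only. */
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

struct undo_entry {
    uint64_t *addr;      /* location being modified (in PMem)   */
    uint64_t  old_val;   /* pre-update value for rollback       */
    uint64_t  valid;     /* set before update, cleared at commit */
};

static inline void persist(const void *p, size_t len)
{
    uintptr_t line = (uintptr_t)p & ~(uintptr_t)63;
    for (; line < (uintptr_t)p + len; line += 64)
        _mm_clwb((void *)line);   /* write back without evicting */
    _mm_sfence();                 /* persist barrier             */
}

/* Failure-atomically set *target = new_val using a persistent undo entry. */
void atomic_update(struct undo_entry *log, uint64_t *target, uint64_t new_val)
{
    log->addr = target;
    log->old_val = *target;
    persist(log, sizeof *log);          /* log contents durable first     */

    log->valid = 1;
    persist(&log->valid, sizeof log->valid);

    *target = new_val;                  /* in-place update                */
    persist(target, sizeof *target);

    log->valid = 0;                     /* commit: retire the undo entry  */
    persist(&log->valid, sizeof log->valid);
}
```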

Challenges

The Read-of-Non-Persistent-Write Problem

In lock-free programs utilizing persistent memory (PMem), the read-of-non-persistent-write problem arises when a successful compare-and-swap (CAS) operation updates a cache line in the volatile cache hierarchy, making the change visible to other threads immediately, but fails to persist it to PMem before a crash occurs. This visibility discrepancy leads to inconsistent states during recovery, as the update appears in volatile memory but is partial or absent on the persistent medium, violating durability guarantees. The issue stems from the separation of atomicity in volatile memory from persistence operations, such as cache flushes and synchronization fences, which are non-atomic and susceptible to interruptions like power failures. A classic example occurs in lock-free linked lists, where a CAS successfully links a new node but the flush to PMem is delayed or interrupted; post-crash recovery may then reveal orphaned nodes, as subsequent pointers referencing the unpersisted update become invalid, resulting in lost updates or structural corruption. Similarly, in lock-free queues or circular buffers, an enqueue operation might update the tail pointer in the cache, allowing other threads to read and dequeue from the "new" position, but a crash could leave the buffer in a state where enqueued items are not recovered, appearing as if the update never occurred despite runtime observations. These scenarios highlight how write-after-read dependencies in concurrent algorithms exacerbate the problem, as reads depend on unconfirmed persistence. This concurrency flaw was first systematically highlighted in the context of durable lock-free data structures for non-volatile memory, building on earlier concepts of durable linearizability introduced around 2016. Prior adaptations of lock-free algorithms to PMem had overlooked the precise ordering required between volatile atomics and persistence barriers, leading to subtle recovery inconsistencies. Mitigations typically involve enforcing strict persistence ordering through explicit flushes and fences after each CAS, though this incurs significant performance overhead due to frequent memory barriers. Software approaches, such as augmenting structures with versioned flags to track persistence status, add validation checks that can increase overhead by up to 10% in read-heavy workloads. Hardware proposals, like persistent compare-and-swap instructions (e.g., PCAS in extended architectures), integrate flushing atomically with the swap to eliminate the gap, though adoption remains limited. Alternative transformations, such as maintaining dual volatile and persistent replicas with sequenced updates, further reduce exposure but trade off some complexity for improved throughput, as seen in persistent queues outperforming baseline methods by over 4x in mixed workloads.
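A common mitigation, often described as link-and-persist, flushes and fences the updated location immediately after a successful CAS so that the value other threads observe is also durable. The sketch below is a simplified, hypothetical fragment of such a scheme; complete designs additionally mark unflushed pointers so concurrent readers can help persist them before acting on the value.

```c
/* Hedged sketch of the flush-after-CAS mitigation: after a successful
 * compare-and-swap, the updated line is written back and fenced before the
 * new value is acted upon, narrowing the window in which other threads can
 * read a value that is visible in cache but not yet persistent.
 * Assumes x86-64 with CLWB (-mclwb); no recovery code is shown. */
#include <immintrin.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static inline void persist_word(void *p)
{
    _mm_clwb(p);
    _mm_sfence();
}

/* Attempt to link `new_node` after a predecessor in a persistent lock-free
 * list by swinging its next field. */
bool link_and_persist(_Atomic(uintptr_t) *next_field,
                      uintptr_t expected, uintptr_t new_node)
{
    if (!atomic_compare_exchange_strong(next_field, &expected, new_node))
        return false;             /* another thread won the race */

    /* Persist the pointer before callers depend on it; real designs also
     * tag the pointer or re-flush on read to close the remaining window. */
    persist_word((void *)next_field);
    return true;
}
```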

Consistency and Reliability Issues

Persistent memory systems face significant challenges in maintaining consistency due to the weak ordering guarantees provided by modern processor architectures. Traditional total store order (TSO) models, common in x86 processors, ensure a total order for volatile memory stores but do not extend this ordering to persistence operations, leading to potential delays in data durability. Store buffers in processors can hold writes temporarily before committing them to persistent media, which may result in non-deterministic persistence ordering across threads or cores, complicating the enforcement of failure-atomic updates. To address this, extensions like persistent TSO (PTSO) integrate buffered epoch persistency with TSO, providing intuitive semantics where persistence is tied to epochs that synchronize with store buffer drains, ensuring that stores become persistent in program order without requiring explicit flushes for every operation. Reliability in persistent memory is further compromised by hardware limitations inherent to non-volatile technologies. Wear-out from limited write endurance—typically 10^6 to 10^9 cycles per cell in phase-change memory (PCM) versus effectively unlimited in DRAM—necessitates wear-leveling techniques to distribute writes evenly across cells, preventing premature failure in hot spots. Start-gap wear-leveling, for instance, periodically remaps lines by rotating a gap location through the address space, spreading writes across the device and extending lifetime from about 5% to 97% of the theoretical maximum (approximately 19 times improvement over naive schemes). Bit error rates in persistent media, such as PCM, can be orders of magnitude higher (e.g., 10^-4 raw errors) than in DRAM due to resistance drift and process variations, requiring advanced error-correcting codes (ECC) like multi-bit schemes that correct up to 4 errors per 512 bits while maintaining low overhead. Power-loss semantics add another layer of complexity, as abrupt failures can leave data in partial states across the persistence domain; strict persistency models mandate immediate durability on store, but relaxed models like epoch-based persistency batch updates for efficiency, relying on failure-atomic commit points to avoid torn or partial updates. Crash consistency requires mechanisms to ensure idempotent recovery, where post-failure restarts either complete or fully roll back operations without side effects. This is challenged by multi-level failure modes spanning the CPU caches (where dirty lines may not flush), the memory controller (which handles asynchronous persistence), and the media itself (prone to partial writes). For example, asynchronous DRAM refresh (ADR) in formerly available Optane systems guarantees that writes already accepted by the memory controller's write-pending queues are flushed to the media on power loss, but data still residing in CPU caches remains vulnerable, necessitating software logging or redo protocols to achieve crash consistency. Techniques like failure-atomic msync (FAMS) ensure that grouped persistence operations on memory-mapped data commit atomically, mitigating risks from these layered failure modes by verifying consistency on recovery. Security issues in persistent memory arise from its byte-addressable nature and integration with volatile caches, exposing data to side-channel risks even after power-off. Persistent caches, such as those in formerly available Optane DC PMem, can leak timing information through internal buffering, enabling attacks that infer access patterns via access latencies. Off-chip side-channel attacks via memory interfaces further amplify threats by observing power or timing variations during persistence. To counter these, encryption of data at rest is essential, often implemented at the hardware level with AES engines in memory controllers, ensuring confidentiality with modest overhead. Following the discontinuation of Optane in 2022, ongoing research as of 2025 focuses on emerging technologies like CXL-based disaggregated persistent memory and improved STT-MRAM or ReRAM with higher endurance to address these challenges in future deployments.

Applications

Current Use Cases

Persistent memory has been deployed in database systems to enable larger in-memory datasets without relying on slower disk or SSD storage. For instance, SAP HANA leveraged Optane DC Persistent Memory (prior to its discontinuation in 2023, with final shipments ending in late 2025) to store main data fragments directly in non-volatile RAM, eliminating the need for initial data loading upon restarts and supporting up to three times the memory capacity of traditional configurations at a lower cost per gigabyte. This allowed SAP HANA to handle massive datasets—such as those exceeding several terabytes—entirely in memory, avoiding spills to SSDs and improving query performance for enterprise applications, though current deployments rely on legacy systems or await alternatives like CXL-based solutions. Similarly, extensions like pmem-redis integrate persistent memory support into Redis, enabling byte-addressable persistence for key-value stores while maintaining low-latency access, though adoption remains more experimental compared to production systems like SAP HANA. In analytics and AI workloads, persistent memory accelerates online analytical processing (OLAP) queries and model training by providing fast, durable storage for intermediate results and checkpoints. For OLAP, systems optimized for persistent memory, such as those using PMem for columnar data scans, achieve query runtimes with only a 1.66x slowdown compared to pure DRAM for read-heavy workloads on large datasets, making it viable for terabyte-scale analytics without frequent disk I/O. In machine learning, persistent memory facilitates frequent checkpointing during training; for example, frameworks like PCcheck use PMem to persist model states concurrently, reducing recovery time from failures to seconds and enabling checkpoints every 10 iterations in distributed setups. Apache Spark extensions, such as those built with the Persistent Memory Development Kit (PMDK), optimized shuffle operations and caching for PMem in the late 2010s and early 2020s, speeding up analytics pipelines by up to 2x in memory-bound tasks. As of 2025, emerging Compute Express Link (CXL) 3.0 implementations are beginning to enable persistent memory pooling in cloud and enterprise environments, with initial deployments in data centers supporting dynamic allocation for AI and OLAP workloads to address DRAM limitations post-Optane. Virtualization environments benefit from persistent memory through faster virtual machine (VM) operations, particularly snapshots and migrations. In VMware vSphere, virtual persistent memory (vPMem) exposes NVDIMMs to guest OSes, allowing VMs to use byte-addressable PMem for applications requiring durability; this reduces snapshot times by persisting memory state directly, avoiding the overhead of flushing volatile RAM to disk, and supports up to 6 TB of PMem per host in 2-socket systems for high-performance workloads, allocatable to VMs. In high-performance computing (HPC), NVDIMMs enable fault-tolerant simulations by providing in-situ checkpointing; for example, algorithm-directed consistency in PMem allows HPC applications to recover from node failures with minimal data loss, improving simulation uptime in large-scale runs like climate modeling or molecular dynamics. Real-world adoption of persistent memory includes cloud deployments and latency-sensitive sectors. Optane PMem was integrated into AWS EC2 instances from 2019 to 2022, powering high-memory r5b instances for in-memory databases and analytics, where it delivered up to 4 TB of capacity per node with persistence for fault tolerance.
In financial trading systems, persistent memory supports low-latency transaction processing by ensuring durability; trading platforms use PMem to process and persist transaction logs in microseconds, reducing recovery times after crashes and enabling real-time analytics on millions of records without disk latency.

Future Prospects

The discontinuation of Intel Optane has spurred research into alternative non-volatile memory technologies, such as resistive RAM (ReRAM), which is projected to see significant market growth—from USD 0.63 billion in 2025 at a CAGR of around 20%—and could enable a revival of persistent memory systems by providing cheaper, high-density media suitable for byte-addressable storage. Storage-class memories from vendors such as Everspin are also emerging as viable post-Optane options, offering performance levels that could match or exceed 3D XPoint while maintaining non-volatility. By 2026, advancements in these media are expected to address endurance and scalability challenges, potentially integrating into production systems for broader adoption. Compute Express Link (CXL) 3.0, released in specification form in 2022 and gaining traction in 2025 implementations, facilitates memory pooling in data centers by enabling cache-coherent sharing of persistent memory across servers, reducing silos and supporting dynamic allocation for large-scale workloads. This technology, with peer-to-peer access and multi-tier switching, is poised for widespread deployment in hyperscale environments by late 2025 and beyond, enhancing efficiency in disaggregated architectures. In disaggregated computing, persistent memory can be separated from compute nodes and accessed remotely, allowing for elastic resource allocation and cost savings in data centers through models like passive disaggregated persistent memory (pDPM). This approach enables remote control of memory pools via network-attached devices, bypassing traditional server-bound limitations and supporting scalable applications. For AI workloads, persistent memory integration via CXL pooling addresses capacity shortages by providing persistent storage for intermediate states, such as tensors during training, potentially accelerating training cycles in memory-intensive models without full restarts. Adoption of persistent memory faces hurdles in cost and standardization, with current pricing for high-capacity modules exceeding economical thresholds for widespread use; reductions to under $0.10 per GB are targeted through scaling production of alternatives like ReRAM to compete with NAND flash. Standardization efforts, particularly through CXL protocols, are critical for ensuring seamless integration across heterogeneous hardware, promoting vendor-agnostic pooling and reducing deployment barriers in multi-vendor data centers. Ongoing research explores quantum-resistant designs for persistent memory to safeguard data against future threats, potentially incorporating post-quantum cryptography directly into memory controllers for secure persistence. Integration with neuromorphic hardware represents another direction, where non-volatile memories like memristors enable energy-efficient, brain-inspired architectures with in-situ learning and persistent synaptic weights, addressing the memory wall in conventional von Neumann architectures.