Memory virtualization

Memory virtualization is a fundamental technique in virtualized computing systems that enables a hypervisor or virtual machine monitor (VMM) to abstract the host's physical memory, presenting each virtual machine (VM) with the illusion of contiguous, dedicated physical memory while allowing dynamic allocation, sharing, and overcommitment of memory resources across multiple VMs without interference from the guest operating systems. This abstraction decouples VM memory demands from the underlying hardware, facilitating efficient resource utilization in virtualized environments such as cloud platforms and data centers.

At its core, memory virtualization operates through address translation mechanisms that map guest virtual addresses to host physical addresses. In software-based approaches, the VMM maintains shadow page tables that map guest virtual addresses directly to machine addresses, combining the guest's own mappings to guest physical addresses (treated as "physical" by the VM but virtualized by the host) with the host's mappings; the VMM must intercept and emulate guest page table updates to keep the shadows consistent. Hardware-assisted methods, introduced in the late 2000s with Intel's Extended Page Tables (EPT) and AMD's Nested Page Tables (NPT), add a second level of translation that is walked by hardware and populated by the VMM, reducing software overhead but potentially increasing translation latency on TLB misses. These techniques ensure isolation between VMs, preventing one VM's memory accesses from affecting others, while supporting features like large pages (e.g., 2 MB or 1 GB) to minimize translation overhead.

To manage memory overcommitment, where the total memory allocated to VMs exceeds the available host RAM, hypervisors employ reclamation strategies such as ballooning, in which a driver in the guest OS inflates a "balloon" of pages to induce the guest to free low-value pages for host reuse, and transparent page sharing, which identifies and deduplicates identical memory pages across VMs based on content hashing. Additional optimizations include memory compression and hypervisor-level swapping to disk, enabling high VM density with minimal performance impact; for instance, ballooning incurs overhead as low as 1.4% under moderate loads. These methods, pioneered in systems like VMware ESX Server in the early 2000s, have become standard for achieving performance isolation and resource efficiency in production environments.

Overall, memory virtualization enhances scalability and cost-effectiveness in virtualized infrastructures by supporting dynamic resource partitioning, but it introduces challenges such as policy conflicts between guest and host memory managers and the need for hardware assistance to mitigate virtualization overhead. Its evolution continues with advancements in processor features, including 5-level paging and correspondingly extended page tables introduced since 2017, as well as integration with confidential computing technologies such as AMD SEV-SNP (available since 2021) and Intel TDX for secure memory isolation.
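The composition of guest and host mappings can be made concrete with a short sketch. The following Python fragment is a minimal illustration rather than a hypervisor implementation: it models single-level page tables as dictionaries (real MMUs walk multi-level radix tables) and shows both the hardware-assisted two-step translation and the precomputed shadow mapping used by software-based approaches. All page numbers are invented.

```python
# Minimal sketch of two-level (nested) address translation with 4 KiB pages.
# Page tables are modeled as flat dictionaries for illustration only.

PAGE_SIZE = 4096

# Guest page table: guest virtual page number -> guest physical page number
guest_page_table = {0x10: 0x2, 0x11: 0x7}

# Nested (EPT/NPT-style) table: guest physical page number -> machine page number
nested_page_table = {0x2: 0x1A3, 0x7: 0x0B4}

def translate(guest_virtual_addr: int) -> int:
    """Translate a guest virtual address to a host machine address."""
    vpn, offset = divmod(guest_virtual_addr, PAGE_SIZE)
    gpn = guest_page_table[vpn]      # level 1: guest virtual -> guest physical
    mpn = nested_page_table[gpn]     # level 2: guest physical -> machine frame
    return mpn * PAGE_SIZE + offset

# A software "shadow" page table precomputes the composition, so a single lookup
# maps guest virtual pages straight to machine frames:
shadow_page_table = {vpn: nested_page_table[gpn]
                     for vpn, gpn in guest_page_table.items()}

print(hex(translate(0x10 * PAGE_SIZE + 0x42)))            # -> 0x1a3042
print(hex(shadow_page_table[0x10] * PAGE_SIZE + 0x42))    # same result
```

In the hardware-assisted case the second lookup corresponds to the EPT/NPT walk performed by the MMU; in the software case the VMM must rebuild the shadow entries whenever the guest edits its page tables, which is the source of the trap-and-emulate overhead described above.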

Historical Background

Origins in Virtual Memory

Virtual memory emerged in the 1960s as a foundational technique in computing, pioneered by teams at the University of Manchester for the Atlas computer and by collaborators from MIT, General Electric, and AT&T Bell Labs for the Multics operating system. The innovation enabled programs to operate as if they had access to a much larger contiguous memory space than the physical hardware provided, by automatically transferring inactive portions of a program's memory, known as pages, to secondary storage such as magnetic drums or disks and retrieving them on demand. The Atlas system, commissioned in 1962 with its paging mechanism already operational, represented the first practical implementation of this approach, using hardware address translation to map logical addresses to physical locations and employing a learning-based algorithm for page replacement.

Key milestones in the early adoption of virtual memory included the 1966 introduction of the IBM System/360 Model 67, the first commercial mainframe from IBM to support demand paging as a standard feature. This system extended the base System/360 architecture with dynamic address translation hardware, allowing pages to be loaded into main memory only when referenced and thus optimizing resource use in time-sharing environments. By the 1970s, virtual addressing was integrated into the evolving Unix operating system at AT&T Bell Labs, where it facilitated process isolation by giving each process its own independent logical address space, preventing interference while supporting multitasking on limited hardware such as the PDP-11 minicomputers. These developments built on Multics' segmented virtual memory model, introduced around 1965, which divided programs into named segments for modular sharing and protection, further refining the technique for multi-user systems.

At its core, virtual memory abstracted away the constraints of physical memory by decoupling the logical address spaces visible to programs from the actual physical layout, thereby enabling efficient multiprogramming on mainframes where multiple jobs could run concurrently without interfering with one another. This separation allowed systems to handle workloads far exceeding installed capacity through transparent paging, dramatically improving utilization and throughput in single-machine environments. Such principles provided the conceptual groundwork for later memory virtualization techniques, though they remained confined to intra-system abstraction rather than cross-machine resource pooling.

Evolution in Data Centers and Cloud Computing

In the late 1990s and early 2000s, the advent of server virtualization marked a significant shift in memory management practices within data centers. VMware, founded in 1998 and launching its first product in 1999, pioneered hypervisor-based virtualization that enabled memory overcommitment, allowing the total virtual machine (VM) memory allocation to exceed the physical host's RAM capacity through techniques like transparent page sharing and ballooning. This approach improved resource utilization on individual servers but remained confined to host boundaries, limiting scalability in multi-node environments.

The 2010s saw the emergence of disaggregated memory architectures, driven by major cloud providers such as AWS and Google to address persistent underutilization of RAM in traditional servers, where memory often remained idle due to overprovisioning for peak loads. These architectures decoupled compute and memory resources, enabling pooled access across nodes via high-speed networks. A pivotal development was 2011 research demonstrating efficient remote memory access using Remote Direct Memory Access (RDMA) over InfiniBand, which facilitated low-latency data transfers and laid the groundwork for cluster-wide memory sharing. In 2019, Intel and a consortium of partners introduced the Compute Express Link (CXL) standard, providing a cache-coherent interconnect for memory pooling that extended beyond local hosts. By 2020, hyperscalers such as Google and AWS had achieved significant improvements in resource utilization through disaggregation and pooling, addressing underutilization rates of 40-60% and projecting reductions of up to 25% in overall memory provisioning, which lowered hardware overprovisioning and costs in large-scale deployments. These gains stemmed from better utilization of idle memory across clusters, minimizing waste in data centers.

In the 2020s, advances integrated persistent memory technologies, such as Intel Optane (discontinued in 2022), into disaggregated pools to create non-volatile shared resources that retained data across power cycles, enhancing reliability for cloud workloads; research has since shifted to alternatives like CXL-attached memory. This evolution built on early virtual memory concepts by extending them to networked, resilient systems.

Core Principles

Definition and Overview

Memory virtualization is a technique in computing systems that enables a hypervisor or virtual machine monitor (VMM) to abstract the host's physical memory, presenting each virtual machine (VM) with the illusion of contiguous, dedicated physical memory. This abstraction allows dynamic allocation, sharing, and overcommitment of memory resources across multiple VMs without interference from the guest operating systems. It decouples VM memory demands from the underlying hardware, improving resource utilization in virtualized environments like cloud platforms and data centers.

The primary purpose is to provide memory isolation and efficient sharing among VMs on a single host, addressing the challenge of running multiple guest OSes that each assume direct access to physical memory. Unlike traditional virtual memory, which operates within a single OS to manage process address spaces via paging to disk, memory virtualization adds a layer of translation for guest physical addresses, treating them as virtual from the host's perspective. This enables features like overcommitment, where the sum of VM memory allocations exceeds host physical memory, improving utilization rates that can otherwise be low in dedicated server setups.

Key benefits include enhanced scalability and cost-effectiveness through resource pooling and dynamic repartitioning, supporting high VM density with minimal performance degradation. For example, techniques like large page support (2 MB or 1 GB pages) reduce translation overhead. However, memory virtualization introduces overhead from additional address translations and potential policy conflicts between guest and host memory managers.
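Overcommitment can be illustrated with simple accounting. The sketch below uses invented capacities and VM sizes to show the quantities a hypervisor tracks: the configured total may exceed host RAM, and reclamation (ballooning, sharing, compression, or swapping) is required only when actively used memory approaches physical capacity.

```python
# Illustrative overcommitment accounting with hypothetical numbers (GB).
HOST_RAM = 256

configured = {"vm-a": 96, "vm-b": 96, "vm-c": 128}   # memory promised to each VM
active     = {"vm-a": 60, "vm-b": 40, "vm-c": 100}   # memory the guests actually touch

overcommit_ratio = sum(configured.values()) / HOST_RAM     # 320 / 256 = 1.25
memory_pressure  = sum(active.values()) - HOST_RAM          # 200 - 256 = -56

print(f"overcommit ratio: {overcommit_ratio:.2f}x")
print(f"reclamation target: {max(memory_pressure, 0)} GB")  # 0 GB: no reclamation yet
```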

Key Mechanisms and Components

Memory virtualization relies on address translation that maps guest virtual addresses through guest physical addresses to host physical (machine) addresses, ensuring isolation and correct access. In software-based implementations, the VMM maintains shadow page tables that mirror the guest's page tables but translate guest virtual addresses directly to machine addresses; the VMM intercepts guest page table updates to keep the shadows consistent, though this can incur significant overhead from frequent traps into the VMM. Hardware-assisted approaches, available since the late 2000s in processors with Intel VT-x and Extended Page Tables (EPT) and AMD's Secure Virtual Machine (AMD-V) with Nested Page Tables (NPT), introduce a second-level translation managed by hardware: the guest physical address is translated to a machine address via EPT/NPT structures populated by the VMM, reducing software involvement and trap frequency, though TLB misses become costlier because of the two-dimensional page walk. These methods support VM isolation by preventing cross-VM memory access and enable optimizations like large pages to minimize page table walks.

To handle overcommitment, hypervisors use reclamation mechanisms such as ballooning, where a guest driver allocates ("inflates") a balloon of pages to pressure the guest OS into freeing low-priority memory for host reuse, and transparent page sharing, which deduplicates identical pages across VMs using content-based hashing. Additional strategies include memory compression to avoid swapping to disk and demand paging from storage, achieving overheads as low as 1-2% under moderate workloads. These components, integral to systems like VMware ESX since the early 2000s, ensure performance isolation and efficient resource use.
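Transparent page sharing is easy to sketch in a few lines. The fragment below is an illustration only, assuming 4 KiB pages and a hash-then-verify policy similar in spirit to content-based sharing in hypervisors; it omits the copy-on-write fault handling a real system needs when a shared page is later written, and all page contents are invented.

```python
# Minimal sketch of transparent page sharing: pages with identical content are
# detected by hashing and mapped to a single backing frame.
import hashlib

PAGE_SIZE = 4096

def share_pages(vm_pages):
    """vm_pages: dict of (vm_id, guest_pfn) -> page contents (bytes).
    Returns (mapping to canonical backing frame, bytes saved)."""
    by_hash = {}
    mapping = {}
    saved = 0
    for key, content in vm_pages.items():
        digest = hashlib.sha1(content).digest()
        if digest in by_hash and vm_pages[by_hash[digest]] == content:  # verify byte-for-byte
            mapping[key] = by_hash[digest]   # point at the existing shared frame
            saved += PAGE_SIZE
        else:
            by_hash[digest] = key
            mapping[key] = key               # this page backs its own frame
    return mapping, saved

pages = {
    ("vm1", 0x10): b"\x00" * PAGE_SIZE,      # zero page
    ("vm2", 0x22): b"\x00" * PAGE_SIZE,      # identical content -> shared
    ("vm2", 0x23): b"\x42" * PAGE_SIZE,
}
mapping, bytes_saved = share_pages(pages)
print(mapping)
print(f"reclaimed: {bytes_saved} bytes")
```

In a real hypervisor the shared frame is mapped read-only into every VM, and a subsequent write triggers a copy-on-write fault that gives the writing VM a private copy.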

Implementation Approaches

Application-Level Integration

Application-level integration in memory virtualization allows applications to interact directly with shared or remote memory pools through user-space libraries and APIs, circumventing traditional kernel-mediated access to achieve lower latency and greater control over data placement. This approach enables developers to implement custom memory access patterns tailored to specific workloads, such as disaggregated computing environments where memory is pooled across nodes in a cluster. By operating in user space, applications can request and manage memory allocations without invoking the operating system for each operation, which minimizes overhead and supports high-performance scenarios such as in-memory data processing.

Key techniques in this integration include memory-mapped files over network file systems enhanced with remote direct memory access (RDMA) capabilities, which allow applications to map remote memory regions directly into their address spaces for efficient sharing. Another prominent method involves direct calls through libraries such as libpmem, which facilitate pooling and management of persistent memory resources across distributed systems, enabling applications to treat memory as a unified pool despite its physical distribution. These techniques use asynchronous operations to overlap computation with data transfers, ensuring that remote access latencies in the range of 1-10 microseconds do not bottleneck application performance, while supporting link speeds up to 100 Gb/s in RoCEv2-based networks.

Practical examples illustrate the versatility of application-level integration. In-memory databases and key-value stores have been extended with remote memory support through user-space drivers that enable querying and caching of data from memory pooled across nodes, improving scalability for large-scale deployments without altering the core engine. These integrations highlight how application-level approaches enable domain-specific optimizations, such as absorbing bursty memory demands in data-intensive workloads.

Despite these advantages, application-level integration introduces complexity: developers must explicitly handle remote faults, such as node failures or network partitions, which can lead to data inconsistencies if not addressed through robust error-recovery mechanisms in the application layer. This requires applications to incorporate fault-tolerant designs, such as replication or checkpointing, directly into their logic, increasing development effort compared to transparent, kernel-level methods.
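The user-space access pattern can be sketched with standard-library primitives. In the illustrative fragment below, an ordinary file (the path and size are hypothetical) stands in for what a real deployment would expose as an RDMA-registered, CXL-attached, or pmem-backed region; the point is that once the pool is mapped, the application performs loads and stores at offsets of its choosing without further system calls.

```python
# Illustrative user-space sketch: map a byte-addressable pool into the address
# space and manage placement in the application itself.
import mmap, os, struct

POOL_PATH = "/tmp/demo_memory_pool"      # stand-in for a remote/pmem-backed pool
POOL_SIZE = 16 * 1024 * 1024             # 16 MiB, chosen arbitrarily

# Create and size the backing object, then map it into the address space.
fd = os.open(POOL_PATH, os.O_CREAT | os.O_RDWR)
os.ftruncate(fd, POOL_SIZE)
pool = mmap.mmap(fd, POOL_SIZE)

# Loads/stores at chosen offsets require no per-access syscalls.
record = struct.pack("<Q32s", 42, b"value".ljust(32, b"\x00"))
pool[0:len(record)] = record                       # "store" into the pool
key, value = struct.unpack("<Q32s", pool[0:40])    # "load" it back
print(key, value.rstrip(b"\x00"))

pool.flush()   # with persistent-memory libraries this would be a persistence barrier
pool.close()
os.close(fd)
```

Libraries such as libpmem replace the flush step with explicit persistence primitives, and RDMA-based pools replace the file mapping with registered memory regions, but the application-managed allocation and direct load/store model is the same.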

Operating System-Level Integration

Operating systems achieve memory virtualization at the kernel level by modifying the virtual memory subsystem to incorporate remote memory pages directly into the local address space, enabling seamless extension of physical memory resources without application awareness. This integration typically involves hooking into the page fault handlers driven by the memory management unit (MMU) to detect and resolve accesses to remote pages, treating them as part of the unified address space managed by the kernel's virtual memory manager (VMM).

Key techniques include extending the swap mechanism to encompass remote memory pools, where idle or evicted pages are paged out to networked memory instead of local disk, and integrating remote access with the page cache for on-demand fetching. For instance, remote paging systems leverage the kernel's swap subsystem to map portions of swap space to remote locations, using efficient protocols like RDMA for low-latency transfers, while the page cache handles caching and prefetching to minimize repeated remote fetches.

Prominent examples include Linux kernel modifications for RDMA-based remote memory access, such as the InfiniSwap project, which implements a virtual block device that interfaces with the kernel's memory manager to distribute swap slabs across remote machines' unused memory. In Microsoft Hyper-V environments, the hypervisor supports memory overcommitment through Dynamic Memory, which dynamically allocates and reclaims portions of the host's physical memory among VMs, with paging to the host's local storage if physical memory is insufficient. These implementations route MMU-generated page faults over the network via modified handlers, enabling demand paging from remote pools; for example, the InfiniSwap system achieves up to 97% of local-memory throughput for certain workloads, with performance depending on network latency. Security in OS-level integration emphasizes encryption of remote paging traffic, for example using IPsec to protect page data in transit against interception, alongside isolation mechanisms such as per-tenant namespaces to segregate accesses and prevent cross-tenant information leaks in multi-tenant setups.
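The demand-paging flow described above can be mimicked with a toy simulation. The sketch below is not kernel code: a small ordered dictionary plays the role of local frames managed with LRU eviction, and a plain dictionary stands in for the remote pool that a real system would reach over RDMA or a network block device; frame count and page size are arbitrary.

```python
# Simplified simulation of kernel-level remote paging with LRU eviction.
from collections import OrderedDict

LOCAL_FRAMES = 4                 # hypothetical local frame budget

remote_pool = {}                 # page number -> contents "paged out" remotely
local_memory = OrderedDict()     # page number -> contents, ordered by recency

def access(page_no):
    if page_no in local_memory:                   # hit: no page fault
        local_memory.move_to_end(page_no)
        return local_memory[page_no]
    # Page fault: make room locally, then fetch from the remote pool (or zero-fill).
    if len(local_memory) >= LOCAL_FRAMES:
        victim, contents = local_memory.popitem(last=False)   # evict LRU page
        remote_pool[victim] = contents                        # "page out" over the network
    contents = remote_pool.pop(page_no, bytearray(4096))      # "page in" on demand
    local_memory[page_no] = contents
    return contents

for p in [1, 2, 3, 4, 5, 1, 2, 6]:
    access(p)
print("local:", sorted(local_memory), "remote:", sorted(remote_pool))
```

A real implementation additionally batches and prefetches pages, overlaps network transfers with execution, and must handle failures of the remote node hosting evicted pages.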

Technologies and Products

Commercial Solutions

Several commercial solutions have emerged to implement memory virtualization, focusing on disaggregation and pooling to optimize resource utilization in data center and cloud environments. These products leverage hardware accelerations such as data processing units (DPUs) and memory fabrics to enable dynamic memory allocation across clusters, supporting demanding workloads such as artificial intelligence (AI) and high-performance computing (HPC).

VMware's vSphere and ESXi platforms, updated since 2021 through initiatives like Project Capitola, turn the ESXi hypervisor into a disaggregated memory pooling and aggregation stack. This approach aggregates DRAM and persistent memory (PMEM) within nodes, with DPU offload via Project Monterey enabling access across PCIe or CXL fabrics for future rack-scale extensions, while integrating with vSAN for complementary storage disaggregation. Microsoft's Azure Stack HCI, built on Hyper-V, facilitates memory pooling through Dynamic Memory allocation that adjusts VM resources based on demand, combined with RDMA networking for low-latency access in hybrid cloud setups, thereby supporting higher VM densities than traditional configurations. As of 2025, the platform integrates Compute Express Link (CXL) support for enhanced disaggregated memory access in virtualized environments.

GigaIO's FabreX, introduced in the late 2010s, delivers a PCIe-based fabric for resource disaggregation, pooling memory and accelerators across servers with sub-100 ns non-blocking switch latencies to maintain performance in scaled deployments. Hewlett Packard Enterprise (HPE) Synergy and Dell's PowerEdge MX platforms provide composable infrastructure with dedicated blades, allowing dynamic allocation of compute, storage, and fabric resources to adapt efficiently to HPC workloads. By 2025, memory virtualization adoption in data centers has accelerated, with technologies addressing AI-driven needs such as extending GPU memory via pooling to overcome capacity limitations in training workloads.

Research and Future Directions

Research in memory virtualization has focused on disaggregating memory resources to improve utilization and scalability in large-scale systems. One notable project is InfiniSwap, introduced in 2017, which implements a swap-based block device for remote memory access in cluster environments using RDMA networks. This approach enables efficient memory disaggregation by treating remote memory as a decentralized swap space, reducing the overhead of traditional paging while supporting high-throughput workloads without requiring application modifications. Researchers at the University of California, San Diego, have explored hardware-software co-design for disaggregation, as demonstrated in the Clio system from 2022. Clio combines hardware and software designs to virtualize, protect, and manage disaggregated memory at dedicated memory nodes, minimizing latency and improving resource elasticity in heterogeneous environments.

Emerging technologies are advancing multi-host memory sharing through standardized interconnects. The Compute Express Link (CXL) 3.0 specification, released in 2022 with ongoing enhancements into 2024, introduces fabric-level coherency protocols that allow multiple hosts to share device-attached memory pools with low-latency access, supporting terabyte-scale disaggregation while maintaining cache coherency across systems. A key prototype in persistent memory disaggregation is the passive disaggregated persistent memory (pDPM) system presented at USENIX ATC 2020, which separates the control and data planes to enable remote non-volatile memory (NVM) access from compute servers. This design uses RDMA for direct memory operations on disaggregated persistent memory, achieving sub-microsecond latencies for key-value store operations by minimizing host-side processing.

Future directions in memory virtualization emphasize AI-driven optimization for disaggregated environments. AI-optimized tiering algorithms, such as those in the GPAC framework for virtual machines, predict access patterns to reduce near-memory usage by 50-70%, lowering migration overhead and improving performance in tiered setups. Challenges persist in scaling to petabyte-scale memory pools, where simulations indicate potential for significant cost reductions through efficient pooling but highlight increased power consumption due to interconnect demands and data movement.
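Tiering decisions of the kind described above can be illustrated with a deliberately simplified placement policy. The sketch below is not the GPAC algorithm: it simply ranks pages by recent access counts (hypothetical values) and keeps the hottest ones in a small near tier (local DRAM), demoting the rest to a larger, slower far tier such as CXL-attached memory.

```python
# Generic sketch of tiered-memory placement based on access frequency.
NEAR_CAPACITY = 2   # pages that fit in the near tier (hypothetical)

access_counts = {"pg0": 50, "pg1": 3, "pg2": 27, "pg3": 1, "pg4": 9}

def place(counts, near_capacity):
    """Return (near_tier, far_tier) page sets for the next interval."""
    ranked = sorted(counts, key=counts.get, reverse=True)
    return set(ranked[:near_capacity]), set(ranked[near_capacity:])

near, far = place(access_counts, NEAR_CAPACITY)
print("near tier:", near)   # hottest pages stay in local DRAM
print("far tier:", far)     # colder pages tolerate higher access latency
```

Production tiering systems additionally weigh migration cost, combine recency with frequency, and may use learned predictors to anticipate future accesses before promoting pages.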
