Kernel-based Virtual Machine
The Kernel-based Virtual Machine (KVM) is an open-source virtualization module integrated into the Linux kernel that enables the kernel to operate as a type-1 hypervisor, allowing multiple isolated virtual machines (VMs) to run on a single physical host with near-native performance.[1][2] It provides full hardware-assisted virtualization primarily for x86 architectures equipped with extensions such as Intel VT-x or AMD-V, treating each VM as a standard Linux process while allocating virtualized resources like CPU, memory, and I/O devices.[2][3] KVM was originally developed by Avi Kivity at Qumranet and announced in 2006, with its core patch set merged into the Linux kernel mainline as part of version 2.6.20 in February 2007.[4][5] This integration marked a significant advancement in open-source virtualization, building on the growing availability of hardware virtualization support in processors from Intel and AMD.[6] Over the years, KVM has evolved through contributions from a global community of over 1,000 developers; it celebrated its 10-year anniversary in 2016 and continues to receive updates in subsequent kernel releases.[6][4]

At its core, KVM consists of kernel modules—the generic kvm.ko and architecture-specific ones such as kvm-intel.ko or kvm-amd.ko—that leverage the host kernel's existing components, such as the memory manager and process scheduler, to handle VM operations efficiently.[2][3] It is typically paired with userspace tools like QEMU for device emulation and VM management, enabling the launch of unmodified guest operating systems, including Linux and Windows, as isolated processes on the host.[2][1] This architecture supports a range of hardware platforms beyond x86, including ARM and IBM z Systems, and facilitates features like resource pooling across VMs.[1]
Key advantages of KVM include its high performance through hardware acceleration, which minimizes overhead and supports low-latency workloads; enhanced security via integrations like SELinux and sVirt for isolating VMs and protecting host resources; and cost efficiency as a mature, free technology backed by an active open-source community.[3][1] It enables practical use cases such as scaling cloud infrastructure, live VM migration without downtime, rapid deployment in data centers, and running legacy applications on modern hardware.[1][6] Widely adopted in enterprise environments by vendors and cloud providers such as Red Hat, AWS, and Canonical (Ubuntu), KVM powers much of today's virtualized computing landscape.[1][3][6]
Introduction and Background
Overview
The Kernel-based Virtual Machine (KVM) is an open-source virtualization module integrated into the Linux kernel, enabling it to function as a type-1 hypervisor for creating and managing virtual machines (VMs).[1] First merged into the mainline Linux kernel in version 2.6.20, released on February 4, 2007, KVM leverages hardware virtualization extensions such as Intel VT-x or AMD-V to support full virtualization on x86 processors, allowing unmodified guest operating systems to run with near-native performance.[7] This integration turns the host Linux kernel into a bare-metal hypervisor, providing efficient resource isolation and management without requiring a separate hosted hypervisor layer.[8]

KVM operates as a loadable kernel module, which exposes a character device at /dev/kvm to facilitate interaction between the kernel and user-space applications.[2] In this architecture, KVM manages core virtualization tasks in kernel space, including CPU virtualization, memory virtualization, and VM scheduling, while user-space components—typically QEMU—handle peripheral device emulation, I/O operations, and the execution of guest operating systems.[9] This division keeps performance-critical operations in the kernel for low overhead, with user-space tools providing flexibility for device modeling and VM configuration.[10]
KVM supports multiple processor architectures, including x86-64, ARM64, PowerPC, IBM z/Architecture (s390), and RISC-V, allowing deployment across diverse hardware platforms. For enhanced I/O performance, it incorporates paravirtualization through the VirtIO framework, which provides semi-virtualized drivers that reduce emulation overhead by enabling direct communication between guest and host.[11] As of 2025, KVM remains a cornerstone of enterprise virtualization, powering server environments in distributions such as Red Hat Enterprise Linux and Oracle Linux, where it supports scalable VM deployments for cloud and data center applications.[12][1]
Historical Development
The Kernel-based Virtual Machine (KVM) originated from work begun in mid-2006 by Avi Kivity at Qumranet, an Israeli virtualization company, leveraging Intel VT and AMD-V hardware extensions to enable Linux kernel-based virtualization.[5][13] Kivity announced the initial version of KVM on October 19, 2006, via a post to the Linux kernel mailing list, marking its first public release as an out-of-tree module.[4] KVM's code was merged into the mainline Linux kernel as part of version 2.6.20, released on February 4, 2007, transitioning it from an external module to a core kernel component.[4] In September 2008, Red Hat acquired Qumranet for $107 million, integrating KVM further into its virtualization ecosystem and accelerating its development under open-source governance.[5] Avi Kivity served as the primary maintainer initially, with Paolo Bonzini taking over as lead maintainer around 2012, guiding KVM's evolution through contributions from Red Hat and the broader Linux community.[14]

Key early milestones included the introduction of live migration capabilities in 2007, allowing seamless transfer of running virtual machines between hosts to minimize downtime.[1] Around 2008–2010, VirtIO emerged as a paravirtualized I/O standard for KVM, with its drivers merged into the Linux kernel in 2008 and formalized through the OASIS VirtIO specification by 2016, enhancing device performance in virtual environments.[15][16] Support for the ARM architecture arrived in Linux kernel 3.9, released in April 2013, with 64-bit ARM (ARM64) support added later that year in kernel 3.11, enabling KVM on mobile and embedded systems.[17] RISC-V support followed in Linux kernel 5.16, released in January 2022, broadening KVM's applicability to open hardware architectures.[18]

In recent years, KVM has seen enhancements focused on security and performance. The Linux 6.18 kernel, with its stable release expected in late November 2025, introduces support for AMD Secure Encrypted Virtualization with Secure Nested Paging (SEV-SNP) features such as CipherText Hiding, bolstering confidential-computing protections against host-side attacks.[19] Throughout 2025, Ubuntu 22.04 LTS received multiple security patches for KVM, addressing vulnerabilities in subsystems such as the x86 architecture and block layer via updates like linux-kvm 5.15.0-1077.86.[20] Discussions at KVM Forum 2025 in Milan explored techniques for applying kernel updates without requiring VM migration, using mechanisms like kexec to reboot the host kernel while preserving running guests.[21] The KVM Forum, an annual community event, has convened developers since 2009 to advance virtualization topics; the 2024 edition in Brno, Czech Republic, emphasized performance optimizations and security hardening through sessions on automated testing, nested virtualization, and migration efficiency.[22][23]
Technical Architecture
Core Components and Internals
KVM operates as a loadable kernel module named kvm.ko, which provides the core virtualization infrastructure for the Linux kernel, enabling it to function as a type-1 hypervisor.[2] This module is supplemented by architecture-specific variants, such as kvm-intel.ko for Intel processors supporting VT-x and kvm-amd.ko for AMD processors supporting Secure Virtual Machine (SVM), which handle hardware-assisted virtualization extensions.[2] These modules are dynamically loaded and integrate seamlessly with the Linux kernel, leveraging its process and memory management subsystems without requiring a separate hypervisor kernel.[24]

The interface between user-space applications and the KVM kernel module is exposed through the /dev/kvm character device file, which supports a range of ioctls for managing virtual machines (VMs) and virtual CPUs (vCPUs).[9] Key ioctls include KVM_CREATE_VM, invoked on /dev/kvm to create a new VM and return a VM-specific file descriptor; KVM_CREATE_VCPU, applied to the VM file descriptor to add a vCPU and obtain a vCPU file descriptor; KVM_SET_USER_MEMORY_REGION, used on the VM file descriptor to define memory slots mapping guest physical addresses to host memory; and KVM_RUN, executed on the vCPU file descriptor to start or resume guest code execution.[9] These ioctls form a hierarchical structure—system-level operations on /dev/kvm, VM-level operations on the VM descriptor, and vCPU-level operations on individual vCPU descriptors—ensuring isolated control over virtualization resources (see the sketch below).[9]

At its core, KVM employs a trap-and-emulate model for virtualization, where guest code executes directly on the host CPU in a restricted mode—VMX non-root mode for Intel VT-x or guest mode for AMD SVM—allowing near-native performance for non-privileged operations.[24] When the guest attempts a privileged instruction, memory access violation, or other sensitive operation, it triggers a VM exit, trapping control to the host kernel in VMX root mode (Intel) or host mode (AMD), where KVM emulates the operation or forwards it to user space as needed.[24] This mechanism relies on hardware virtualization extensions to minimize overhead, with VM exits handled efficiently so that guest execution resumes promptly.[24] KVM integrates closely with user-space components like QEMU for comprehensive VM management, where QEMU handles device emulation and I/O while KVM focuses on CPU and memory virtualization.
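The ioctl hierarchy described above can be exercised directly from a small C program. The following sketch—assuming an x86-64 host with /dev/kvm available, and with error handling omitted for brevity—creates a VM, registers one page of guest memory containing a single HLT instruction, creates a vCPU, and runs it until the resulting KVM_EXIT_HLT exit. It is a minimal illustration of the documented KVM API, not production code.

```c
#include <fcntl.h>
#include <linux/kvm.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    int kvm = open("/dev/kvm", O_RDWR | O_CLOEXEC);   /* system-level fd */
    int vm  = ioctl(kvm, KVM_CREATE_VM, 0);           /* VM-level fd */

    /* Back 4 KiB of guest physical memory at GPA 0 with host memory. */
    void *mem = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    const uint8_t code[] = { 0xf4 };                  /* x86 HLT instruction */
    memcpy(mem, code, sizeof(code));

    struct kvm_userspace_memory_region region = {
        .slot            = 0,
        .guest_phys_addr = 0,
        .memory_size     = 0x1000,
        .userspace_addr  = (uint64_t)(uintptr_t)mem,
    };
    ioctl(vm, KVM_SET_USER_MEMORY_REGION, &region);

    int vcpu = ioctl(vm, KVM_CREATE_VCPU, 0);         /* vCPU-level fd */

    /* The shared kvm_run structure is mmap'ed from the vCPU fd. */
    int run_size = ioctl(kvm, KVM_GET_VCPU_MMAP_SIZE, 0);
    struct kvm_run *run = mmap(NULL, run_size, PROT_READ | PROT_WRITE,
                               MAP_SHARED, vcpu, 0);

    /* Point the vCPU at the guest code (real mode, CS:IP = 0:0). */
    struct kvm_sregs sregs;
    ioctl(vcpu, KVM_GET_SREGS, &sregs);
    sregs.cs.base = 0;
    sregs.cs.selector = 0;
    ioctl(vcpu, KVM_SET_SREGS, &sregs);

    struct kvm_regs regs = { .rip = 0, .rflags = 0x2 };
    ioctl(vcpu, KVM_SET_REGS, &regs);

    ioctl(vcpu, KVM_RUN, 0);                          /* enter guest mode */
    printf("exit reason: %u\n", run->exit_reason);    /* expect KVM_EXIT_HLT */
    return 0;
}
```

Real VMMs such as QEMU follow the same sequence, layering full device models, multiple memory slots, and an exit-handling loop on top of it.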
Within this framework, KVM implements virtual CPUs as regular Linux threads, enabling the host's Completely Fair Scheduler (CFS) to manage vCPU scheduling alongside native processes, which simplifies resource allocation and ensures fair CPU time distribution.[25] For memory virtualization, KVM supports both shadow paging—where the host maintains a shadow copy of the guest's page tables—and hardware-accelerated two-dimensional paging via Extended Page Tables (EPT) on Intel or Nested Page Tables (NPT) on AMD, which map guest physical addresses directly to host physical addresses to reduce translation overhead.[26] Address space management in KVM involves mapping guest physical addresses (GPAs) to host virtual addresses (HVAs), typically through memory slots configured via KVM_SET_USER_MEMORY_REGION, where user space allocates host memory and registers it with KVM for pinning and access control.[9] This mapping supports features like dirty logging, enabled by the KVM_MEM_LOG_DIRTY_PAGES flag, which tracks modified pages via a bitmap retrieved through KVM_GET_DIRTY_LOG, facilitating live migration and memory inspection without full emulation overhead (see the sketch below).[9] Additionally, KVM accommodates memory ballooning through protocols like virtio-balloon, allowing dynamic adjustment of guest memory allocation by inflating or deflating a balloon device to reclaim or return host memory pages, optimizing overcommitment scenarios.[26]

In Linux kernel 6.18, KVM introduced optimizations for AMD hardware, including enabling Secure Advanced Virtual Interrupt Controller (Secure AVIC) support, which accelerates interrupt virtualization and reduces VM entry/exit latency for workloads with frequent interrupts.[19] These enhancements build on prior AMD SVM improvements, improving overall VM performance in nested and high-interrupt environments.[19]
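As a hypothetical continuation of the earlier API sketch, the fragment below shows how a user-space VMM might enable dirty logging on a memory slot and fetch the resulting bitmap—the basic building block of pre-copy live migration. The vm_fd and region arguments are assumed to come from that sketch, and slot sizes are illustrative.

```c
#include <linux/kvm.h>
#include <stdlib.h>
#include <sys/ioctl.h>

/* Illustrative helper: turn on dirty-page tracking for an existing slot and
 * retrieve the bitmap of pages the guest has written since the last call. */
static void fetch_dirty_pages(int vm_fd, struct kvm_userspace_memory_region *region)
{
    /* Re-register the slot with dirty logging enabled. */
    region->flags = KVM_MEM_LOG_DIRTY_PAGES;
    ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, region);

    /* The bitmap holds one bit per 4 KiB page in the slot. */
    size_t pages = region->memory_size / 4096;
    unsigned long *bitmap = calloc((pages + 63) / 64, sizeof(unsigned long));

    struct kvm_dirty_log log = {
        .slot         = region->slot,
        .dirty_bitmap = bitmap,
    };
    /* Returns (and clears) the set of pages dirtied since the last call;
     * a migration loop would now copy exactly those pages to the target. */
    ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log);

    free(bitmap);
}
```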
Hardware Support and Emulation
KVM relies on specific hardware virtualization extensions provided by the host CPU to enable efficient virtual machine execution. Mandatory requirements include Intel VT-x for basic virtualization on x86 processors, augmented by Extended Page Tables (EPT) for accelerated memory management, or the equivalent AMD-V (Secure Virtual Machine) with Rapid Virtualization Indexing (RVI), also known as Nested Page Tables (NPT), on AMD platforms. These extensions allow the hypervisor to trap and emulate sensitive instructions while minimizing overhead. Optional but recommended features include Intel VT-d or AMD-Vi for Input-Output Memory Management Unit (IOMMU) support, which enables secure direct device assignment (PCI passthrough) by isolating device DMA traffic. Without these core extensions, KVM cannot operate in hardware-accelerated mode and falls back to software emulation, which is significantly slower.[27][28]

The primary supported architecture for KVM is x86-64, where it leverages mature hardware virtualization capabilities for broad compatibility. Support extends to ARM64 (AArch64) processors with virtualization extensions (EL2), including emulation of the Generic Interrupt Controller (GIC) versions 2 and 3 to handle guest interrupts efficiently. PowerPC (PPC64) platforms, particularly those based on IBM's pSeries, are supported via KVM on Linux distributions such as those from Red Hat. The s390 architecture, used in IBM mainframes, integrates KVM for z/VM-like virtualization with architecture-specific optimizations. RISC-V support was introduced in Linux kernel 5.16 in early 2022, enabling KVM on 64-bit (RV64) implementations with the hypervisor extension (H-extension), initially focusing on basic CPU and memory virtualization.[9][29][30]

KVM's emulation strategy emphasizes minimal intervention in the kernel, confining most device and peripheral emulation to user-space processes for modularity and security. The kernel module handles the core VM lifecycle, CPU scheduling, and memory protection, while offloading I/O emulation to tools like QEMU, which provides comprehensive device models through dynamic binary translation or hardware acceleration. For lighter-weight scenarios, alternatives such as crosvm (a Rust-based VMM from Google for Chrome OS) or Firecracker (Amazon's microVM for serverless workloads) integrate with KVM to emulate only essential peripherals, reducing the attack surface and cutting boot times to under 125 milliseconds. Firmware emulation is managed via SeaBIOS for legacy BIOS compatibility or OVMF (an open-source UEFI implementation) for modern boot processes, ensuring guests can initialize hardware as if on physical systems. This hybrid approach allows KVM to scale from full-system emulation to paravirtualized environments.[31][32][33]

In terms of emulated hardware, KVM supports virtual CPUs (vCPUs) scaled up to the host's physical core count, enabling multi-threaded guest workloads with features like CPU hotplug. Memory allocation mirrors host RAM limits, with virtio-balloon drivers allowing dynamic resizing to optimize resource sharing across VMs. Basic I/O subsystems, including PCI buses for expansion cards and USB controllers (e.g., UHCI or EHCI models), are emulated primarily through QEMU's device backends, providing guests with standardized interfaces for peripherals like keyboards, storage, and graphics.
These emulations ensure compatibility but introduce latency compared to native hardware.[34][27]

To mitigate emulation overhead, KVM offers paravirtualization, particularly through the VirtIO standard, which exposes semi-virtualized devices to guests via simple ring buffers and shared memory. VirtIO drivers for block storage (virtio-blk), networking (virtio-net), and serial consoles (virtio-console) bypass full hardware emulation by allowing direct communication between the guest and host kernels, achieving near-native I/O throughput—up to 10 Gbps for networking on modern hosts. Guests must install these drivers (available for Linux and Windows) to benefit, reducing the CPU cycles spent on trap-and-emulate transitions by orders of magnitude.[11][35]

Recent advancements include RISC-V vector extension emulation in Linux kernel 6.10 (released July 2024), enabling KVM to support the RVA22 profile's scalable vector processing for guests on hosts lacking native hardware vectors, through software fallback mechanisms. On ARM64, improvements in 2025 introduced support for the Arm Confidential Compute Architecture (CCA) in KVM, allowing protected VMs (realms) with features like granule protection faults, realm entry/exit handling, and enhanced VGIC/timer support, based on the RMM v1.0 specification for end-to-end memory encryption and attestation. These updates expand KVM's applicability to emerging secure and vector-accelerated workloads.[36][37]
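Whether a given host provides the extensions and limits discussed above can be probed at runtime through KVM's capability interface. The following sketch—an illustrative check rather than a complete tool—opens /dev/kvm, verifies the API version, and queries a few documented capabilities such as the recommended vCPU count.

```c
#include <fcntl.h>
#include <linux/kvm.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
    int kvm = open("/dev/kvm", O_RDWR | O_CLOEXEC);
    if (kvm < 0) {
        /* Either the CPU lacks VT-x/AMD-V, virtualization is disabled in
         * firmware, or the kvm modules are not loaded. */
        perror("open /dev/kvm");
        return 1;
    }

    /* The stable KVM API version has been 12 since KVM was mainlined. */
    printf("KVM API version: %d\n", ioctl(kvm, KVM_GET_API_VERSION, 0));

    /* KVM_CHECK_EXTENSION reports per-host capabilities and limits. */
    printf("recommended max vCPUs per VM: %d\n",
           ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_NR_VCPUS));
    printf("maximum vCPUs per VM:         %d\n",
           ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_MAX_VCPUS));
    printf("maximum memory slots:         %d\n",
           ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_NR_MEMSLOTS));

    close(kvm);
    return 0;
}
```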
Features and Capabilities
Key Virtualization Features
KVM provides robust CPU virtualization capabilities, allowing the creation of multiple virtual CPUs (vCPUs) to present symmetric multiprocessing (SMP) environments to guest operating systems. This enables guests to leverage multi-core processing for enhanced performance in parallel workloads. KVM also supports vCPU overcommitment, where the total number of vCPUs across all guests can exceed the host's physical CPU cores, with the Linux scheduler managing time-sharing to maintain efficiency. Dynamic vCPU hotplug, which permits adding or removing vCPUs during guest runtime without rebooting, was introduced in Linux kernel version 3.10, released in 2013.[34][38]

Memory management in KVM emphasizes flexibility and efficiency through dynamic allocation, where host memory can be adjusted for guests on demand to optimize resource utilization. A key mechanism is memory ballooning, implemented via a paravirtualized balloon driver in the guest that inflates to reclaim memory for the host or deflates to return it, facilitating sharing without significant performance degradation. Additionally, KVM supports huge pages (typically 2 MB or 1 GB), which reduce translation lookaside buffer (TLB) misses and improve memory access speeds in I/O-intensive or large-memory scenarios.[39][40][41]

Live migration enables seamless transfer of running virtual machines between hosts with minimal interruption, using pre-copy and post-copy methods to handle memory transfer. In pre-copy, introduced alongside early KVM development in 2007, memory pages are copied iteratively while changed (dirty) pages are tracked until the working set stabilizes; in post-copy, proposed for KVM in 2009 and enhanced with better fault handling and recovery in 2015, the guest is briefly suspended, resumed on the destination host, and its remaining pages are fetched on fault. These techniques achieve downtimes of under a second for typical workloads, supporting high-availability environments.[42][43]

KVM includes in-kernel mechanisms for snapshotting and checkpointing, allowing complete VM states—including CPU registers, memory, and device contexts—to be saved and restored through ioctls such as KVM_GET_VCPU_EVENTS/KVM_SET_VCPU_EVENTS and memory slot management (see the sketch below). This facilitates backup, debugging, and rapid recovery without full guest reboots. Guest OS compatibility is broad, with support for Linux distributions, Windows (optimized via VirtIO drivers for storage and networking), BSD variants, and Solaris, ensuring near-native performance across diverse environments. Nested virtualization, enabling hypervisors to run inside guest VMs, has been available since 2010, supporting advanced testing and development scenarios.[44]

As of Linux kernel 6.13 (released January 2025), KVM added support for Arm's Confidential Compute Architecture (CCA), enabling protected virtual machines with enhanced security isolation. Further enhancements in kernel 6.14 (released March 2025) include improved RISC-V guest support for extensions like Zabha, Svvptc, and Ziccrse.[45][46]
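As a rough illustration of the checkpointing ioctls mentioned above—assuming a vCPU file descriptor obtained as in the earlier API sketch—a VMM can capture per-vCPU architectural state roughly as follows. A real snapshot would additionally serialize guest memory, device models, and in-kernel interrupt controller state.

```c
#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Minimal container for the architectural state of one vCPU (x86). */
struct vcpu_checkpoint {
    struct kvm_regs        regs;    /* general-purpose registers, RIP, RFLAGS */
    struct kvm_sregs       sregs;   /* segment and control registers */
    struct kvm_vcpu_events events;  /* pending exceptions, NMIs, interrupts */
};

static void save_vcpu(int vcpu_fd, struct vcpu_checkpoint *cp)
{
    ioctl(vcpu_fd, KVM_GET_REGS, &cp->regs);
    ioctl(vcpu_fd, KVM_GET_SREGS, &cp->sregs);
    ioctl(vcpu_fd, KVM_GET_VCPU_EVENTS, &cp->events);
}

static void restore_vcpu(int vcpu_fd, struct vcpu_checkpoint *cp)
{
    ioctl(vcpu_fd, KVM_SET_REGS, &cp->regs);
    ioctl(vcpu_fd, KVM_SET_SREGS, &cp->sregs);
    ioctl(vcpu_fd, KVM_SET_VCPU_EVENTS, &cp->events);
}
```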
Device Emulation and Paravirtualization
KVM handles guest device input/output primarily through full emulation provided by QEMU, which simulates hardware components in user space to support legacy devices incompatible with modern virtualization techniques.[47] For instance, QEMU emulates IDE disk controllers for compatibility with older operating systems, standard VGA graphics adapters for basic display output, and sound devices such as AC97 or SB16 for audio playback. This approach involves trapping guest I/O instructions into the host kernel via KVM, where QEMU interprets and emulates the operations, resulting in significant performance overhead from frequent VM exits and context switches.[48]

To mitigate the limitations of full emulation, KVM employs paravirtualization via the VirtIO standard, introduced in 2008 as a de facto interface for efficient virtual I/O across hypervisors.[49] VirtIO devices present a semi-virtualized PCI interface to the guest, requiring paravirtualized drivers in the guest OS that communicate with the host using shared ring buffers known as vrings, which enable batched, zero-copy data transfers and reduce trap frequency.[50][49] Key examples include virtio-blk for block storage, which uses a single queue for read/write operations with sector-based addressing; virtio-net for networking, supporting transmit/receive queues with offload features like checksum and TSO; and virtio-gpu for accelerated graphics, providing 2D/3D rendering via shared memory.[49][51] These guest drivers minimize emulation needs by handling device-specific logic, achieving performance close to native I/O while maintaining broad compatibility.[50]

For scenarios demanding near-native performance, KVM supports PCI passthrough, allowing direct assignment of host PCI devices to guests via the VFIO framework, which provides IOMMU-protected access without emulation overhead (see the sketch at the end of this section).[52] Introduced in Linux kernel 3.6 in 2012, VFIO enables safe userspace binding of devices like GPUs or NICs, isolating them in IOMMU groups to prevent unauthorized DMA and deliver bare-metal driver performance to the guest.[53][52]

USB and input device handling in KVM relies on QEMU's emulation of USB controllers, supporting USB 2.0 via EHCI and USB 3.0 via XHCI, alongside virtual USB devices for peripherals like keyboards and mice. These emulations allow guests to interact with virtual or passed-through USB hardware, with PS/2 or USB tablet models ensuring seamless input capture. For remote access, the SPICE protocol integrates with QEMU to stream display output and relay input events over dedicated channels, supporting keyboard, mouse, and multi-monitor setups with low-latency client-side rendering.[54]

Recent advancements enhance VirtIO efficiency in KVM, including full support for the VirtIO 1.1 specification in Linux kernel 5.15, released in 2021, which introduces features like packed virtqueues for reduced descriptor overhead and improved live migration compatibility.[51][55]
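The VFIO binding sequence referenced above follows the pattern described in the kernel's VFIO documentation. The sketch below—using a made-up IOMMU group number and PCI address, and with error handling abbreviated—shows how user space obtains a device file descriptor that a VMM such as QEMU then exposes to the guest.

```c
#include <fcntl.h>
#include <linux/vfio.h>
#include <stdio.h>
#include <sys/ioctl.h>

int main(void)
{
    /* A container represents one IOMMU context shared by assigned devices. */
    int container = open("/dev/vfio/vfio", O_RDWR);
    if (ioctl(container, VFIO_GET_API_VERSION) != VFIO_API_VERSION)
        return 1;
    if (!ioctl(container, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU))
        return 1;                        /* Type1 IOMMU backend unavailable */

    /* Group 26 and the PCI address below are placeholders; the real values
     * come from /sys/kernel/iommu_groups/ and the device's sysfs entry. */
    int group = open("/dev/vfio/26", O_RDWR);

    struct vfio_group_status status = { .argsz = sizeof(status) };
    ioctl(group, VFIO_GROUP_GET_STATUS, &status);
    if (!(status.flags & VFIO_GROUP_FLAGS_VIABLE))
        return 1;                        /* not all group members bound to vfio */

    /* Attach the group to the container and select the IOMMU model. */
    ioctl(group, VFIO_GROUP_SET_CONTAINER, &container);
    ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU);

    /* Obtain a device fd; its regions and IRQs are then mapped for the guest. */
    int device = ioctl(group, VFIO_GROUP_GET_DEVICE_FD, "0000:06:0d.0");
    printf("vfio device fd: %d\n", device);
    return 0;
}
```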
Management and Tools
Command-Line and API Interfaces
The primary command-line interface for managing KVM virtual machines is virsh, a shell provided by the libvirt library, which enables administrators to handle the full lifecycle of guest domains.[56] Common operations include listing active domains with virsh list, starting a domain via virsh start <domain>, stopping it with virsh shutdown <domain>, and pausing or resuming execution as needed.[57] This tool abstracts interactions with the underlying hypervisor, allowing domain creation from XML definitions and configuration edits without direct kernel access.[58]
For lower-level control, the qemu-kvm binary serves as the direct emulator invocation point, integrating KVM acceleration when specified.[59] It is typically launched with options such as -enable-kvm to activate hardware-assisted virtualization and -m <size> to allocate guest memory, for example, qemu-kvm -enable-kvm -m 2048 -drive file=disk.img. This approach bypasses higher-level management layers for custom or debugging scenarios, though it requires manual handling of device emulation and networking.
Libvirt provides a stable C API for programmatic access to the QEMU/KVM hypervisor, with the library managing the QEMU processes that in turn drive /dev/kvm.[60] Developers can use functions such as virConnectOpen to establish a hypervisor connection and virDomainCreate to start defined domains, enabling embedded integration in C applications (see the sketch below). For scripting, the libvirt-python bindings offer Pythonic wrappers around this API, supporting automation tasks such as domain monitoring and resource allocation via modules like libvirt and libvirt.qemu.[61] These bindings, available through PyPI, allow scripts to interact with KVM guests in a cross-platform manner.[62]
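A minimal sketch of the C API, assuming a locally defined but inactive domain named "guest" and the system QEMU/KVM URI; compile with -lvirt:

```c
#include <stdio.h>
#include <libvirt/libvirt.h>

int main(void)
{
    /* Connect to the local QEMU/KVM hypervisor managed by libvirtd. */
    virConnectPtr conn = virConnectOpen("qemu:///system");
    if (conn == NULL) {
        fprintf(stderr, "failed to connect to qemu:///system\n");
        return 1;
    }

    /* "guest" is a placeholder: look up a persistently defined domain
     * and boot it if it is currently inactive. */
    virDomainPtr dom = virDomainLookupByName(conn, "guest");
    if (dom != NULL) {
        if (virDomainCreate(dom) == 0)
            printf("domain started\n");
        virDomainFree(dom);
    }

    virConnectClose(conn);
    return 0;
}
```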
Disk management in KVM environments often leverages qemu-img, a utility for creating, resizing, and converting virtual disk images in formats like QCOW2 or raw. For instance, qemu-img create -f qcow2 disk.img 20G initializes a 20 GB sparse image, while qemu-img resize disk.img +10G expands an existing one for guest use. Automated VM provisioning is streamlined with virt-install, a tool from the virt-manager project, which defines and deploys guests from command-line arguments, including ISO installation media and network bridges, as in virt-install --name guest --ram 1024 --disk path=disk.img --cdrom install.iso --os-variant rhel8.
KVM interfaces integrate with orchestration platforms through libvirt hooks, such as those used by OpenStack Nova for compute node operations, where Nova's libvirt driver provisions and migrates VMs via KVM.[63] Similarly, KubeVirt extends Kubernetes to run container-native VMs on KVM, encapsulating QEMU processes in pods for unified workload management.[64]
Libvirt 10.0, released in January 2024, includes improvements such as postcopy-preempt migration for faster QEMU VM migrations.[65] Subsequent releases, like 10.5.0 in July 2024, introduced support for AMD SEV-SNP confidential computing. As of November 2025, libvirt 11.9.0 adds features like Hyper-V host-model mode, enhancing cross-architecture compatibility and security isolation in API-driven deployments.[65]