Express Data Path
eXpress Data Path (XDP) is a high-performance, programmable networking framework integrated into the Linux kernel that enables fast packet processing directly within the kernel's network driver context, allowing for efficient handling of incoming network packets at the earliest possible stage without requiring kernel bypass techniques.[1] Developed as part of the IO Visor Project, XDP leverages eBPF (extended Berkeley Packet Filter) programs to inspect, modify, forward, or drop packets, providing a safe and flexible environment for custom data plane operations while maintaining compatibility with the existing Linux networking stack.[2][1]
XDP was first introduced in 2016 through contributions from developers at Facebook and Red Hat and merged into the mainline Linux kernel in version 4.8; its design was later formalized in a 2018 research paper presented at the ACM CoNEXT conference.[2][1] The framework executes eBPF bytecode, compiled from high-level languages such as C, early in the receive (RX) path of network interface controllers (NICs), enabling decisions such as dropping a packet before socket-buffer allocation or stack traversal, which minimizes overhead and allows common tasks to be handled without userspace involvement.[1][2] Key actions include XDP_DROP for discarding packets, XDP_PASS for handing packets to the kernel stack, XDP_TX for transmitting back out the receiving interface, and XDP_REDIRECT for rerouting to other interfaces or sockets; programs are verified at load time via static analysis to prevent kernel crashes.[1]
In terms of performance, XDP achieves up to 24 million packets per second (Mpps) per core on commodity hardware, outperforming the traditional kernel path and some userspace solutions by reducing latency and CPU utilization in high-throughput scenarios.[1] It supports stateful processing through eBPF maps (for example, hash tables and counters) and integrates with AF_XDP sockets for zero-copy user-space access, making it suitable for applications such as DDoS mitigation, load balancing, and inline firewalls.[2][1] Since its inception, XDP has been adopted in production by organizations such as Red Hat and projects such as Cilium, with ongoing enhancements in recent Linux kernels expanding hardware offload support and metadata access for even greater efficiency.[3][1]
Overview
Definition
Express Data Path (XDP) is an eBPF-based technology designed for high-performance network packet processing within the Linux kernel. It integrates directly into the network interface card (NIC) driver at the earliest receive (RX) point, allowing eBPF programs to execute on incoming packets before they proceed further into the kernel.[4][2]
The core purpose of XDP is to enable programmable decisions on incoming packets prior to kernel memory allocation or involvement of the full networking stack, thereby minimizing overhead and maximizing throughput. This approach supports processing rates up to 26 million packets per second per core on commodity hardware.[5]
In contrast to traditional networking paths, XDP bypasses much of the operating system stack—for instance, avoiding initial allocation of socket buffer (skb) structures—to achieve lower latency and reduced CPU utilization.[6] Originally developed as a GPL-licensed component of the Linux kernel, XDP received a Windows port in 2022, released under the MIT license.[7] As of 2025, developments like XDP2 are being proposed to further extend its capabilities for modern high-performance networking.[8]
Advantages
XDP provides significant performance benefits by enabling line-rate packet processing directly in the network driver, achieving throughputs exceeding 100 Gbps on multi-core systems while maintaining low latency. This is accomplished by executing eBPF programs at the earliest possible stage in the receive path, before the creation of socket buffer (skb) structures or the invocation of generic receive offload (GRO) and segmentation offload (GSO) layers, which reduces processing overhead for high-volume traffic scenarios such as DDoS mitigation and traffic filtering. For instance, simple packet drop operations can reach up to 20 million packets per second (Mpps) per core, far surpassing traditional methods.[9][10]
In terms of resource efficiency, XDP minimizes CPU utilization by allowing early decisions on packet fate—such as dropping invalid packets—thereby freeing kernel resources for other tasks and avoiding unnecessary memory allocations or context switches deeper in the networking stack. This approach supports scalable deployment across multiple cores without the need for kernel bypass techniques like DPDK, while retaining the security and interoperability of the Linux networking subsystem. Additionally, XDP's potential for zero-copy operations further reduces memory bandwidth consumption, enhancing overall system efficiency in bandwidth-intensive environments.[11][6][10]
The flexibility of XDP stems from its integration with eBPF, enabling programmable custom logic for packet processing without requiring kernel modifications or recompilation, which facilitates rapid adaptation to evolving network requirements. Compared to conventional tools like iptables or nftables, XDP can be significantly faster for basic filtering tasks, with speedups of up to 5 times, due to its position in the data path and avoidance of higher-layer overheads. Furthermore, XDP enhances ecosystem observability through seamless integration with tools like bpftrace, allowing for efficient monitoring and debugging of network events in production environments.[9][11][10]
History and Development
Origins
The development of Express Data Path (XDP) was initiated in 2016 by Jesper Dangaard Brouer, a principal kernel engineer at Red Hat, in response to the growing demands for high-performance networking in cloud computing environments where traditional Linux kernel networking stacks struggled with speeds exceeding 10 Gbps.[12] Traditional kernel processing, including socket buffer (SKB) allocation and memory management, created significant bottlenecks under high packet rates, often limiting throughput to below line-rate performance for multi-gigabit interfaces.[1] The project aimed to enable programmable, kernel-integrated packet processing that could rival user-space solutions like DPDK while maintaining compatibility with the existing networking stack.[13]
Key contributions came from the open-source Linux kernel community, with significant input from engineers at Google, Amazon, and Intel, who helped refine the design through collaborative patch reviews and testing.[1] Early efforts built upon the eBPF (extended Berkeley Packet Filter) framework, which had advanced in 2014 to support more complex in-kernel programs, allowing XDP to extend programmable packet processing beyond existing hooks like traffic control (tc).[12]
Initial prototypes focused on integrating XDP hooks into network drivers, with testing conducted on Netronome SmartNICs to evaluate offloading capabilities and on Mellanox ConnectX-3 Pro adapters (supporting 10/40 Gbps Ethernet) to demonstrate drop rates up to 20 million packets per second on a single core.[12] These prototypes validated the feasibility of early packet inspection and processing directly in the driver receive path, minimizing overhead from higher-layer kernel components.[1]
Milestones
XDP was initially merged into Linux kernel version 4.8 in 2016, introducing basic support for programmable packet processing at the driver level, with initial driver support in the Mellanox mlx4 Ethernet driver.[14]
In 2018, Linux kernel 4.18 added AF_XDP, a socket family enabling efficient user-space access to XDP-processed packets, facilitating zero-copy data transfer between kernel and user space.[15]
Microsoft ported XDP to the Windows kernel in 2022, releasing an open-source implementation that integrated with the MsQuic library to accelerate QUIC protocol processing by bypassing the traditional network stack.[16]
Between 2023 and 2024, XDP driver support expanded to additional Intel Ethernet controllers, such as the E810 series, while Netronome hardware offloading achieved greater stability through kernel enhancements for reliable eBPF program execution on smart NICs.[17][4]
In 2024 and 2025, kernel updates addressed critical issues, including a fix for race conditions in the AF_XDP receive path identified as CVE-2025-37920, where improper synchronization in shared umem mode could lead to concurrent access by multiple CPU cores; this was resolved by relocating the rx_lock to the buffer pool structure.[18] The eBPF ecosystem around XDP also grew, with the introduction of uXDP as a userspace runtime for executing verified XDP programs outside the kernel while maintaining compatibility, and innovative workarounds enabling XDP-like processing for egress traffic via kernel loopholes.[19][20]
XDP's core implementation in Linux remains under the GPL license, ensuring integration with the kernel's licensing requirements, whereas the Windows port adopts the more permissive MIT license to broaden adoption across platforms.
Core Functionality
Data Path Mechanics
The eXpress Data Path (XDP) hook is integrated at the earliest point in the receive (RX) path within the Linux kernel's network device driver, immediately following the network interface card (NIC)'s direct memory access (DMA) transfer of packet data into kernel memory buffers from the RX descriptor ring, but prior to any socket buffer (skb) allocation or engagement with the broader network stack. This placement minimizes latency by allowing programmable processing before traditional kernel overheads. In cases where a driver lacks native support, XDP falls back to a generic mode (also known as SKB mode) that integrates into the kernel's NAPI processing after skb allocation, resulting in slightly higher overhead but ensuring compatibility.[21][22][23]
Upon DMA transfer, the raw packet data resides in a kernel buffer, where an eBPF program attached to the XDP hook executes directly on it, using metadata from the xdp_md context structure, such as the packet data pointers, the ingress interface index, and the RX queue index, for contextual analysis. This flow enables rapid decisions on packet disposition without propagating the frame through the full kernel network stack, reducing CPU cycles and memory usage in high-throughput scenarios. XDP supports three execution modes: native mode, which embeds the hook directly in the driver for optimal performance on supported hardware; generic mode, a universal software fallback that integrates into the standard RX path with slightly higher overhead; and offload mode, where the eBPF program is transferred to the NIC for hardware-accelerated execution, bypassing the host CPU entirely.[21][24][4]
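The context passed to the program is struct xdp_md, defined (abridged) in the kernel's UAPI header include/uapi/linux/bpf.h; the comments summarize what each field exposes in recent kernels:
```c
struct xdp_md {
    __u32 data;            /* offset to the first byte of packet data       */
    __u32 data_end;        /* offset just past the last byte of data        */
    __u32 data_meta;       /* offset to the driver-provided metadata area   */
    __u32 ingress_ifindex; /* index of the net_device the packet arrived on */
    __u32 rx_queue_index;  /* RX queue the packet was received on           */
    __u32 egress_ifindex;  /* set only for programs run from a devmap       */
};
```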
To enhance efficiency, XDP leverages the kernel's page pool API for memory management, maintaining pools of recycled pages for XDP frames and any skbs later built from them, which avoids frequent page allocations and reduces cache misses in high-rate environments. This approach coexists with multicast handling, where replicated packets can be processed across the relevant queues, and with Receive Side Scaling (RSS), which distributes ingress load via hardware hashing to multiple RX queues for parallel eBPF execution. Traditionally limited to ingress processing on the RX path, XDP saw 2025 advancements enabling egress support through eBPF-based techniques that manipulate kernel packet-direction heuristics, extending its applicability to outbound traffic without native TX hooks.[25][26][20]
Actions
In eXpress Data Path (XDP), the possible decisions an XDP program can make on a received packet are determined by returning one of the values from the enum xdp_action, which the kernel uses to execute the corresponding handling without further program involvement.[27] These actions enable efficient packet processing at the driver level, allowing for high-performance decisions such as dropping unwanted traffic or redirecting packets to alternative paths.[4]
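The action values are defined in the kernel's UAPI header include/uapi/linux/bpf.h:
```c
enum xdp_action {
    XDP_ABORTED = 0,  /* program error: drop the packet and raise a tracepoint */
    XDP_DROP,         /* silently discard the packet in the driver             */
    XDP_PASS,         /* hand the packet to the normal kernel network stack    */
    XDP_TX,           /* transmit the packet back out the receiving interface  */
    XDP_REDIRECT,     /* forward to another interface, CPU, or AF_XDP socket   */
};
```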
XDP_DROP instructs the kernel to immediately discard the packet, freeing the underlying DMA buffer directly in the driver without allocating kernel data structures like sk_buff or passing the packet to the network stack. This action is particularly effective for early-stage filtering, such as mitigating DDoS attacks, as it minimizes resource consumption and latency compared to traditional stack-based dropping.[21]
XDP_PASS forwards the packet to the standard Linux kernel networking stack for further processing, such as routing, firewalling, or delivery to user space.[4] It allows the XDP program to inspect or minimally modify the packet before normal handling resumes, preserving compatibility with existing network functionality.
XDP_TX causes the kernel to transmit the packet back out through the same network interface it arrived on, often used for reflecting packets or simple redirects without changing the egress device.[21] This action reuses the original buffer for transmission, enabling low-overhead operations like packet mirroring or bouncing invalid ingress traffic.[4]
XDP_REDIRECT redirects the packet to a different network interface, CPU queue, or AF_XDP socket, typically invoked via the eBPF helper bpf_redirect() or map-based variants like bpf_redirect_map(). It supports advanced forwarding scenarios, such as load balancing across devices, by handing off the buffer to another driver or processing context.[21]
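For illustration, a minimal map-based redirect might look like the following sketch; the tx_ports devmap with a single entry (a target interface index written from userspace) is an assumption of this example rather than part of any particular project, and the lookup-failure fallback encoded in the flags argument requires kernel 5.6 or newer.
```c
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* Hypothetical devmap; userspace is assumed to store a target ifindex at key 0. */
struct {
    __uint(type, BPF_MAP_TYPE_DEVMAP);
    __uint(max_entries, 1);
    __type(key, __u32);
    __type(value, __u32);
} tx_ports SEC(".maps");

SEC("xdp")
int xdp_redirect_all(struct xdp_md *ctx)
{
    /* Redirect every packet to the device stored at key 0; if the lookup
     * fails, the flags argument makes the helper return XDP_PASS so that
     * traffic is not silently lost. */
    return bpf_redirect_map(&tx_ports, 0, XDP_PASS);
}

char _license[] SEC("license") = "GPL";
```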
XDP_ABORTED, with a value of 0, signals an error condition in the XDP program; the packet is dropped and the kernel raises the xdp:xdp_exception tracepoint, which can be monitored with standard tracing tools.[27] This action is intended to flag program errors during development and debugging rather than to serve as a deliberate drop path in production.[4]
The kernel interprets the returned enum xdp_action value and performs the corresponding operation immediately after program execution, keeping overhead in the data path minimal. Statistics for these actions, including counts of drops, passes, transmissions, and redirects, are exposed by supported network drivers through the ethtool utility, allowing administrators to monitor XDP behavior and efficacy.[28]
eBPF Integration
Program Development
eBPF programs for XDP are written in a restricted subset of the C programming language, using kernel UAPI headers such as <linux/bpf.h> together with libbpf headers such as <bpf/bpf_helpers.h> for the required types and helper declarations.[29] Developers define the main program function and annotate it with the SEC("xdp") macro to place it in the appropriate ELF section so it is recognized as an XDP program during loading.[30] The function takes a struct xdp_md *ctx parameter, providing access to packet pointers and metadata such as ingress_ifindex, the index of the receiving interface.[4] Programs must return an enum xdp_action value, such as XDP_DROP to discard the packet or XDP_PASS to continue normal processing.[29]
To compile the C source into an ELF object file containing eBPF bytecode, developers use LLVM/Clang with the BPF target architecture. The command clang -O2 -target bpf -c program.c -o program.o generates the object file, enabling features like bounded loops and helper function inlining supported by the LLVM BPF backend.[31] This process ensures the bytecode adheres to eBPF instruction constraints verified by the kernel.
Loading the program into the kernel typically uses the libbpf library, which provides the bpf_prog_load() function with BPF_PROG_TYPE_XDP as the program type.[32] Once loaded, the program file descriptor is attached to a network device with bpf_xdp_attach() (the successor to the older bpf_set_link_xdp_fd()) or, for link-based attachment in newer kernels, bpf_link_create() with the BPF_XDP attach type.[4] Alternatively, the iproute2 suite offers a command-line interface for attachment, ip link set dev <interface> xdp obj program.o sec xdp, simplifying deployment without custom userspace code. For inspection and management, the kernel's bpftool utility can list loaded programs with bpftool prog show and attached XDP programs with bpftool net show.
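As a minimal sketch of this libbpf flow (assuming libbpf 1.0 or newer, and assuming program.o contains the xdp_drop_non_ip program shown below), a userspace loader might look like this:
```c
#include <net/if.h>
#include <stdio.h>
#include <linux/if_link.h>   /* XDP_FLAGS_* */
#include <bpf/libbpf.h>

int main(int argc, char **argv)
{
    const char *ifname = argc > 1 ? argv[1] : "eth0"; /* example interface */
    int ifindex = if_nametoindex(ifname);

    /* Open and load the compiled eBPF object file. */
    struct bpf_object *obj = bpf_object__open_file("program.o", NULL);
    if (!obj || bpf_object__load(obj))
        return 1;

    /* Look the program up by its C function name and fetch its fd. */
    struct bpf_program *prog =
        bpf_object__find_program_by_name(obj, "xdp_drop_non_ip");
    if (!prog)
        return 1;

    /* Attach in generic (SKB) mode, which works with any driver;
     * XDP_FLAGS_DRV_MODE would request the native driver hook instead. */
    if (bpf_xdp_attach(ifindex, bpf_program__fd(prog), XDP_FLAGS_SKB_MODE, NULL))
        return 1;

    printf("XDP program attached to %s\n", ifname);
    return 0;
}
```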
A representative example is a simple XDP program that drops packets with non-IPv4 Ethernet types:
```c
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

SEC("xdp")
int xdp_drop_non_ip(struct xdp_md *ctx)
{
    void *data_end = (void *)(long)ctx->data_end;
    void *data = (void *)(long)ctx->data;
    struct ethhdr *eth = data;

    /* Bounds check required by the verifier before touching the header. */
    if ((void *)(eth + 1) > data_end)
        return XDP_PASS;

    /* Drop everything that is not IPv4. */
    if (eth->h_proto != bpf_htons(ETH_P_IP))
        return XDP_DROP;

    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";
```
This program locates the Ethernet header through the data and data_end pointers exposed by the ctx argument, performs the bounds check required by the verifier, and selectively drops non-IPv4 traffic.[33] Metadata such as ctx->ingress_ifindex can be used for interface-specific logic, for example applying different actions depending on the receiving device.[4]
Debugging XDP programs commonly starts with bpf_trace_printk(), which writes messages to the kernel's tracing buffer (readable from /sys/kernel/debug/tracing/trace_pipe), though its overhead makes it unsuitable for production use. For more scalable telemetry, developers populate eBPF maps with counters or statistics that userspace applications read and aggregate.[34] The kernel verifier decides whether a program is accepted by statically analyzing its bytecode for safety, so development typically iterates through compilation and loading to resolve verification failures.[4]
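For illustration, a minimal sketch of the map-based telemetry approach: a per-CPU array counts every packet the program sees, and a userspace tool can read and sum the per-CPU values; the map and program names are illustrative.
```c
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* Per-CPU counter readable from userspace; each CPU owns its own slot,
 * so no atomic operations are needed in the fast path. */
struct {
    __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
    __uint(max_entries, 1);
    __type(key, __u32);
    __type(value, __u64);
} pkt_count SEC(".maps");

SEC("xdp")
int xdp_count_packets(struct xdp_md *ctx)
{
    __u32 key = 0;
    __u64 *count = bpf_map_lookup_elem(&pkt_count, &key);

    if (count)
        (*count)++;

    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";
```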
Safety Mechanisms
The eBPF verifier serves as a critical in-kernel static analyzer for XDP programs, simulating their execution path to ensure safety before loading. It performs exhaustive checks for potential issues such as unreachable instructions, out-of-bounds memory accesses relative to packet boundaries (e.g., ensuring offsets do not exceed the data_end pointer in XDP contexts), invalid use of helper functions, and violations of the kernel's security model. If any unsafe behavior is detected, the verifier rejects the program, preventing it from being loaded and executed, thereby avoiding kernel crashes or exploits. This verification process is mandatory for all eBPF program types, including XDP, and operates on the program's bytecode without requiring runtime overhead during packet processing.[35][4]
To enforce bounded execution, the verifier prohibits unbounded loops in eBPF programs, a restriction that originated with early eBPF designs to guarantee termination; since Linux kernel 5.3, bounded loops are permitted, but only if the verifier can prove they terminate within its resource limits. A further safeguard is the instruction limit: the verifier processes at most one million instructions per program (raised in kernel 5.2 from the earlier 4,096-instruction cap), which bounds both program size and verification cost. Additionally, map accesses, such as those to eBPF maps used for state in XDP filtering, are validated at load time, ensuring pointers remain within allocated bounds and preventing arbitrary memory corruption. These mechanisms collectively keep XDP programs deterministic and resource-bounded, maintaining kernel stability even under high packet rates.[36][37]
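To illustrate the bounded-loop rule, the following hedged sketch steps over up to two stacked VLAN tags with a fixed loop bound, a pattern the verifier can prove terminates (kernel 5.3 or later); the constants and function name are illustrative.
```c
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/if_vlan.h>   /* struct vlan_hdr */
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

SEC("xdp")
int xdp_skip_vlan(struct xdp_md *ctx)
{
    void *data_end = (void *)(long)ctx->data_end;
    void *data = (void *)(long)ctx->data;
    struct ethhdr *eth = data;

    if ((void *)(eth + 1) > data_end)
        return XDP_PASS;

    __u16 proto = eth->h_proto;
    void *cursor = eth + 1;

    /* Fixed bound of two iterations: enough for QinQ, provably terminating. */
    for (int i = 0; i < 2; i++) {
        if (proto != bpf_htons(ETH_P_8021Q) &&
            proto != bpf_htons(ETH_P_8021AD))
            break;

        struct vlan_hdr *vh = cursor;
        if ((void *)(vh + 1) > data_end)
            return XDP_PASS;

        proto = vh->h_vlan_encapsulated_proto;
        cursor = vh + 1;
    }

    /* Drop anything that is not IPv4 after the VLAN tags. */
    return proto == bpf_htons(ETH_P_IP) ? XDP_PASS : XDP_DROP;
}

char _license[] SEC("license") = "GPL";
```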
Following successful verification of the bytecode, the kernel may apply just-in-time (JIT) compilation to translate it into native machine code for faster execution. The verifier's safety checks operate solely on the portable bytecode, independent of the JIT step, so this optimization cannot introduce behavior that was not already verified. For error handling, XDP programs must return one of the predefined actions (e.g., XDP_PASS to continue processing, XDP_DROP to discard the packet, or XDP_REDIRECT for forwarding), which the kernel interprets to decide the packet's fate; returning XDP_ABORTED signals a program error, causing the kernel to drop the packet and raise the xdp:xdp_exception tracepoint so the event can be logged and investigated.[38][39][4]
In recent Linux kernel developments through 2025, the eBPF verifier has seen enhancements to support more complex operations, including improved precision for packet redirects (e.g., via XDP_REDIRECT with tail calls) and metadata handling in XDP programs, where additional packet metadata can be safely accessed without bound violations. These updates, such as proof-based refinement mechanisms, allow the verifier to handle intricate control flows more accurately while rejecting fewer valid programs, building on ongoing efforts to balance safety and expressiveness in high-performance networking scenarios.[40][41]
User-Space Access
AF_XDP Sockets
AF_XDP sockets, introduced in Linux kernel version 4.18, provide a specialized address family (PF_XDP) designed for high-performance, zero-copy input/output operations that enable direct packet transfer from kernel-space XDP programs to user-space applications, bypassing much of the traditional networking stack.[42] This raw socket type facilitates efficient packet processing by allowing XDP eBPF programs to redirect ingress traffic straight to user-space buffers, supporting applications requiring low-latency and high-throughput networking.
To create an AF_XDP socket, applications invoke the standard socket syscall with the address family AF_XDP, socket type SOCK_RAW, and protocol 0: fd = socket(AF_XDP, SOCK_RAW, 0);. Following creation, the socket must be bound to a specific network interface and receive queue ID using the bind() syscall, specifying parameters such as the interface index, queue identifier, and socket options via setsockopt() for features like shared user memory (UMEM) registration. This binding associates the socket with a particular hardware receive queue, enabling targeted packet reception from XDP-processed traffic on that queue.
The core of AF_XDP's efficiency lies in its user memory (UMEM) model, where user-space allocates a contiguous memory region registered with the kernel via setsockopt() using the SOL_XDP level. This UMEM is divided into fixed-size frames, and communication between kernel and user-space occurs through four lock-free ring buffers: the RX ring for incoming packet descriptors from the kernel to user-space, the TX ring for outgoing descriptors from user-space to the kernel, the FILL ring for user-space to supply empty frames to the kernel, and the COMPLETION ring for the kernel to notify user-space of processed frames. Descriptors in these rings reference UMEM frame addresses and lengths, allowing shared access without data copying in optimal configurations.
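A minimal sketch of this setup sequence follows, assuming a modern kernel and libc; the frame and ring sizes are arbitrary examples, error handling is simplified, and the subsequent mmap() of the rings is omitted. Production code would normally use the xsk helpers from libxdp instead.
```c
#include <linux/if_xdp.h>
#include <net/if.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef AF_XDP              /* provided by newer libcs */
#define AF_XDP 44
#endif
#ifndef SOL_XDP
#define SOL_XDP 283
#endif

#define NUM_FRAMES 4096
#define FRAME_SIZE 2048
#define RING_SIZE  2048

int setup_xsk(const char *ifname, unsigned int queue_id)
{
    int fd = socket(AF_XDP, SOCK_RAW, 0);
    if (fd < 0)
        return -1;

    /* Allocate page-aligned user memory and register it as the UMEM. */
    void *umem_area = NULL;
    if (posix_memalign(&umem_area, getpagesize(),
                       (size_t)NUM_FRAMES * FRAME_SIZE))
        goto err;

    struct xdp_umem_reg umem = {
        .addr = (uint64_t)(uintptr_t)umem_area,
        .len = (uint64_t)NUM_FRAMES * FRAME_SIZE,
        .chunk_size = FRAME_SIZE,
        .headroom = 0,
    };
    if (setsockopt(fd, SOL_XDP, XDP_UMEM_REG, &umem, sizeof(umem)))
        goto err;

    /* Create the four rings; they are mmap()ed afterwards using offsets
     * from getsockopt(fd, SOL_XDP, XDP_MMAP_OFFSETS, ...), not shown here. */
    int sz = RING_SIZE;
    if (setsockopt(fd, SOL_XDP, XDP_UMEM_FILL_RING, &sz, sizeof(sz)) ||
        setsockopt(fd, SOL_XDP, XDP_UMEM_COMPLETION_RING, &sz, sizeof(sz)) ||
        setsockopt(fd, SOL_XDP, XDP_RX_RING, &sz, sizeof(sz)) ||
        setsockopt(fd, SOL_XDP, XDP_TX_RING, &sz, sizeof(sz)))
        goto err;

    /* Bind to one RX queue of the interface; without mode flags the kernel
     * chooses zero-copy when the driver supports it, copy mode otherwise. */
    struct sockaddr_xdp sxdp = {
        .sxdp_family = AF_XDP,
        .sxdp_ifindex = if_nametoindex(ifname),
        .sxdp_queue_id = queue_id,
    };
    if (bind(fd, (struct sockaddr *)&sxdp, sizeof(sxdp)))
        goto err;

    return fd;

err:
    close(fd);  /* umem_area is intentionally leaked in this sketch */
    return -1;
}
```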
AF_XDP supports two operational modes for packet handling: copy mode, which relies on traditional sk_buff-based transfers and is compatible with all XDP-capable drivers, and zero-copy mode, which gives the driver direct DMA access to UMEM pages for ingress and egress, minimizing overhead but requiring explicit driver support.[43] If no mode flag is specified at bind time, the kernel attempts zero-copy where available and otherwise falls back to copy mode; applications can also force a mode with the XDP_ZEROCOPY or XDP_COPY bind flags. Driver support for zero-copy has expanded in recent kernels, enhancing performance on supported hardware.[43]
As of 2025, AF_XDP has seen integrations aimed at broader ecosystem compatibility, including an AF_XDP poll-mode driver in the DPDK framework that lets DPDK applications use AF_XDP sockets for raw packet I/O on supported NICs, easing migration from kernel-bypass libraries.[44] Additionally, experimental implementations in DNS servers such as NSD use AF_XDP to handle UDP queries directly in user space, reporting roughly a 1.7x improvement in query rate over traditional UDP handling in experimental tests, with minimal CPU overhead.[45][46]
Zero-Copy Features
AF_XDP enables zero-copy packet handling through a shared memory region known as UMEM, which consists of a contiguous block of user-allocated memory divided into fixed-size frames, typically 2 KB or 4 KB each, to hold packet data without intermediate copies between kernel and user space.[47] The kernel driver writes packet descriptors directly into ring buffers mapped to this UMEM, allowing the network interface card (NIC) to DMA packet data straight into the frames, while the user-space application accesses the data via these descriptors.[47] This structure supports multiple AF_XDP sockets sharing the same UMEM for efficient resource utilization in multi-queue setups.[47]
Ring buffer operations in zero-copy mode rely on four memory-mapped rings associated with the UMEM: the fill ring, where the user space provides available frames for incoming packets; the RX ring, where the kernel enqueues receive descriptors pointing to filled frames; the TX ring, for user-submitted transmit descriptors; and the completion ring, where the kernel signals TX completions.[47] The user space polls the head and tail pointers of these single-producer/single-consumer rings to synchronize access, minimizing system calls through techniques like busy-polling or eventfd notifications, while the kernel updates them atomically to reflect buffer states.[47] This design ensures seamless data flow without memcpy operations, as both kernel and user space operate on the shared memory.[43]
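A hedged sketch of this receive-side polling follows, assuming the RX ring has already been mmap()ed, that ring_size is a power of two, and that process_packet() stands in for application-specific handling:
```c
#include <linux/if_xdp.h>

/* Placeholder for application-specific packet handling. */
void process_packet(void *pkt, unsigned int len);

/* Consume all descriptors currently available in the RX ring.
 * rx_map:    mmap()ed RX ring (XDP_PGOFF_RX_RING)
 * off:       ring offsets from getsockopt(XDP_MMAP_OFFSETS)
 * umem_area: base address of the registered UMEM
 * ring_size: number of descriptors (power of two)                    */
static void drain_rx_ring(void *rx_map, const struct xdp_ring_offset *off,
                          void *umem_area, unsigned int ring_size)
{
    __u32 *producer = (__u32 *)((char *)rx_map + off->producer);
    __u32 *consumer = (__u32 *)((char *)rx_map + off->consumer);
    struct xdp_desc *descs = (struct xdp_desc *)((char *)rx_map + off->desc);

    __u32 cons = *consumer;
    /* Acquire load pairs with the kernel's release store of the producer index. */
    __u32 prod = __atomic_load_n(producer, __ATOMIC_ACQUIRE);

    while (cons != prod) {
        struct xdp_desc *d = &descs[cons & (ring_size - 1)];
        process_packet((char *)umem_area + d->addr, d->len);
        cons++;
    }

    /* Publish the new consumer index so the kernel can reuse the slots. */
    __atomic_store_n(consumer, cons, __ATOMIC_RELEASE);
}
```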
By eliminating the overhead of data copying between kernel and user space, zero-copy AF_XDP achieves significant performance improvements, such as line-rate processing exceeding 40 Gbps for receive-only workloads in user-space applications like packet capture on 40 Gbps NICs.[48] These gains stem from reduced CPU cycles on memory transfers and fewer context switches, enabling applications to handle high-throughput traffic more efficiently than traditional socket interfaces.[49]
To request zero-copy mode explicitly, applications set the XDP_ZEROCOPY flag when binding the socket, which requires a NIC driver with direct UMEM access support, such as Intel's i40e for 40 Gbps Ethernet; with that flag set, the bind fails if the driver cannot provide zero-copy, whereas binding without a mode flag falls back to copy mode using SKB buffers.[47][50] Zero-copy is only available with native (XDP_DRV) driver support, in contrast to the generic XDP_SKB path, which always copies data.[47]
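A small sketch of explicit mode selection; the socket and sockaddr_xdp are assumed to be prepared as in the earlier setup sketch:
```c
#include <linux/if_xdp.h>
#include <sys/socket.h>

/* Bind an AF_XDP socket with an explicit mode flag. With XDP_ZEROCOPY the
 * kernel rejects the bind if the driver cannot provide zero-copy (instead
 * of silently copying); XDP_COPY always selects the copy path. */
static int bind_with_mode(int fd, struct sockaddr_xdp *sxdp, __u16 mode_flag)
{
    sxdp->sxdp_flags = mode_flag; /* XDP_ZEROCOPY or XDP_COPY */
    return bind(fd, (struct sockaddr *)sxdp, sizeof(*sxdp));
}
```
An application that merely prefers zero-copy can attempt the XDP_ZEROCOPY bind first and, if it fails, create a fresh socket and bind it with XDP_COPY, or with no mode flag to let the kernel decide.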
In 2025, advancements addressed reliability issues, including a fix for race conditions in the generic RX path under shared UMEM scenarios (CVE-2025-37920), where improper locking could lead to data races across multiple sockets, now resolved by relocating the rx_lock to the buffer pool level in Linux kernel versions post-6.9.[18] Performance studies on mixed-mode deployments, combining zero-copy and copy-based sockets on programmable NICs, highlighted scalability benefits but noted potential bottlenecks from uneven buffer allocation, informing optimizations for hybrid environments.[51]
Hardware Support
Offloading Modes
XDP supports hardware offloading through specific modes that enable execution of programs directly on the network interface card (NIC), bypassing the host CPU for packet processing. The primary mode is requested with the XDP_FLAGS_HW_MODE flag, which attaches the eBPF program for full offload to the NIC when both the driver and the hardware support it.[4] When hardware offload is unavailable, the program can instead run on the host in native driver mode (XDP_FLAGS_DRV_MODE) or, as a last resort, in generic mode (XDP_FLAGS_SKB_MODE), which uses the kernel's socket buffer (SKB) path. Additionally, launch-time offload for transmit (TX) metadata, merged in Linux kernel 6.14 (2025), allows the NIC to schedule packets based on specified timestamps without host intervention.[52][53]
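As a sketch (assuming libbpf 0.8 or newer, and noting that genuine hardware offload additionally requires the program to have been loaded against the target device, which is driver-specific and omitted here), an application can try the attach modes in order of preference:
```c
#include <linux/if_link.h>
#include <bpf/libbpf.h>

/* Try offload, native, and generic attachment in that order.
 * Returns the flag that succeeded, or -1 if every mode failed. */
static int attach_best_effort(int ifindex, int prog_fd)
{
    const __u32 modes[] = {
        XDP_FLAGS_HW_MODE,   /* run on the NIC itself                 */
        XDP_FLAGS_DRV_MODE,  /* native hook inside the host driver    */
        XDP_FLAGS_SKB_MODE,  /* generic fallback after skb allocation */
    };

    for (unsigned int i = 0; i < sizeof(modes) / sizeof(modes[0]); i++)
        if (bpf_xdp_attach(ifindex, prog_fd, modes[i], NULL) == 0)
            return modes[i];

    return -1;
}
```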
The offloading process involves compiling the eBPF program into a format compatible with the target hardware, such as P4 for programmable switches or NIC-specific bytecode, before loading it onto the device. This compilation ensures the program adheres to the hardware's instruction set limitations. The program is then loaded using the devlink interface, a kernel subsystem for managing device resources, which handles the transfer to the NIC firmware. The kernel verifier performs compatibility checks during loading to confirm that the program and hardware align, preventing mismatches that could lead to failures. Driver-specific hooks facilitate the attachment, ensuring seamless integration with the NIC's data path.[54][55]
Offloading provides significant benefits, including zero involvement from the host CPU after initial setup, enabling line-rate packet processing on SmartNICs even under high traffic loads. It supports core XDP actions such as DROP and TX entirely on the hardware, allowing packets to be filtered or forwarded without reaching the host stack, which is particularly useful for security and performance-critical applications.[56][57]
However, hardware offload is constrained by a subset of eBPF features, excluding complex operations like advanced map manipulations or certain helper functions to match hardware capabilities. It also requires periodic NIC firmware updates to incorporate new offload support, limiting adoption to compatible devices.[58]
Integration with Time-Sensitive Networking (TSN) has advanced, enabling XDP offload to support deterministic traffic scheduling in industrial and real-time environments.[59]
Supported Devices
Express Data Path (XDP) hardware support is available on select network interface controllers (NICs) and platforms, enabling native or offloaded execution of XDP programs for high-performance packet processing.
Intel Ethernet controllers provide full native XDP offload through the ice driver for E810 series devices, supporting XDP and AF_XDP zero-copy operations on Linux kernels 4.14 and later.[60] The 700-series controllers, such as those based on X710, achieve similar support via the i40e driver for native XDP on kernels 4.14 and later, with iavf handling virtual functions in SR-IOV configurations on kernels 5.10 and above.[61][62]
Netronome Agilio SmartNICs have offered early and stable XDP offload support since 2016, allowing eBPF/XDP programs to execute directly on the NIC hardware for packet filtering and processing tasks.[63]
NVIDIA (formerly Mellanox) ConnectX-6 and later NICs support driver-level XDP execution, enabling high-throughput packet handling in native mode on Linux.[64] Hardware offload for XDP is not supported on BlueField-2 DPUs as of the latest available information (2023), with development ongoing.[64]
Other vendors include Broadcom's Stingray family of SmartNICs, which support XDP offload by running full Linux distributions on the device, facilitating eBPF program deployment for network functions.[65] Marvell OCTEON DPUs, such as those in the TX and 10 series, provide XDP and eBPF acceleration in configurations like Asterfusion's Helium SmartNICs, targeting security and load balancing workloads.[66] Software-based XDP support extends to virtualized environments via the virtio-net driver, available since Linux kernel 4.10 for both host and guest packet processing.[67]
On Windows, basic XDP functionality is provided by the open-source XDP-for-Windows project, which implements a high-performance packet I/O interface similar to Linux XDP and ships as a driver built with the Windows Driver Kit (WDK). Hardware offload is supported on select Azure NICs, such as NVIDIA Mellanox adapters in virtualized setups, though the feature set remains primarily optimized for Linux guests with experimental Windows extensions.[7]
To query XDP support and status on Linux, administrators can use ethtool -l <interface> to view channel configurations relevant to XDP multi-queue operation and ethtool -S <interface> for statistics, including XDP drop counts.[68] For offload flags and parameters, devlink dev param show <device> displays hardware offload capabilities, such as XDP mode settings on supported NICs.
Applications
Use Cases
XDP has been widely deployed for DDoS mitigation, where it enables early packet dropping of malformed or suspicious traffic at the network interface level, often using the DROP action to discard packets before they consume kernel resources. Integration with intrusion detection systems like Suricata allows XDP to apply custom eBPF filters for real-time threat detection and blocking, as demonstrated in deployments throughout 2024 that handle high-volume attacks efficiently. For instance, Cloudflare's L4Drop tool leverages XDP to filter Layer 4 DDoS traffic, achieving rapid mitigation by processing packets directly in the driver.[69][70][71]
In load balancing and telemetry applications, XDP supports packet redirection to specific queues or devices using the REDIRECT action, facilitating efficient traffic distribution in containerized environments. Cilium, an eBPF-based networking solution for Kubernetes, employs XDP to accelerate service load balancing and enable flow sampling for monitoring, providing cluster-wide visibility into network traffic without kube-proxy overhead. This approach is particularly effective in dynamic cloud-native setups, where XDP programs dynamically update rules based on telemetry data to optimize routing and detect anomalies.[72][73]
For high-speed packet capture and forwarding, XDP combined with AF_XDP sockets enables user-space applications to bypass the kernel stack, serving as a foundation for tools that outperform traditional utilities like tcpdump. Cloudflare's xdpcap, for example, uses XDP to capture packets at line rate directly from the driver, supporting forwarding scenarios in monitoring and analysis pipelines. Red Hat's xdpdump further illustrates this by integrating XDP for efficient traffic examination in enterprise environments.[74][75]
XDP accelerates QUIC and HTTP processing by enabling receive-side scaling, distributing incoming connections across CPU cores for better throughput in modern web protocols. Microsoft's MsQuic implementation incorporates XDP to bypass the kernel for UDP packet handling, improving latency and scalability in high-performance networking stacks. Research on QUIC acceleration confirms XDP's role in offloading receive processing, making it suitable for edge computing and content delivery networks.[16][76]
Recent 2025 advancements highlight XDP's expanding versatility, such as a technique exploiting virtual Ethernet (veth) interfaces to apply XDP programs to egress traffic for shaping and rate limiting, previously limited to ingress paths. In DNS servers, the Name Server Daemon (NSD) integrates AF_XDP sockets to handle elevated query rates, enhancing protection against amplification attacks by enabling rapid filtering and processing of UDP traffic on port 53. Since Linux kernel 6.11 (released September 2024), XDP also includes improved multi-buffer support for AF_XDP, boosting performance in cloud-native environments.[20][77][78]
Enterprise adoption of XDP is evident in cloud providers, where it supports VPC traffic filtering through eBPF integrations for access control and flow optimization. AWS employs eBPF, including XDP capabilities via tools like Cilium, to enforce network security groups and tune VPC flows for enhanced observability and threat detection. Similarly, Google Cloud integrates XDP in Google Kubernetes Engine via Cilium, enabling efficient packet filtering and load balancing within shared VPC architectures.[79][80][81]
Performance
XDP programs demonstrate high throughput in packet drop operations, achieving up to 14.9 million packets per second (Mpps) per core on Intel Core i7 processors, as measured in 2019 tests on Linux 4.18 systems with simple eBPF filters.[82] With hardware offload to SmartNICs, performance scales to up to 18 Mpps, enabling efficient processing on 25 Gbps interfaces without host involvement.[83]
Comparisons highlight XDP's efficiency for filtering tasks, delivering 5-10 times higher throughput than nftables, with XDP sustaining up to 7.2 Mpps under heavy drop loads while nftables tops out at around 1.5 Mpps with minimal rules.[84] For user-space access via AF_XDP sockets, zero-copy mode reaches approximately 90% of line rate on high-speed links, compared to 50% with traditional copy-based sockets, by avoiding kernel-to-user data transfers.[42]
Key metrics include decision latencies under 1 μs for basic operations in the driver hook, though average forwarding latency measures around 7 μs at 1 Mpps loads.[10] CPU utilization remains below 5% when handling 40 Gbps traffic with multi-core scaling via Receive Side Scaling (RSS), allowing efficient resource use across cores.[82] Recent 2025 evaluations of userspace XDP (uXDP) implementations report up to 40% performance improvements over kernel-mode execution for certain network functions, such as load balancing, enhancing throughput for complex network functions.[19]
Performance testing commonly employs tools like TRex for generating high-volume traffic and pktgen for kernel-based packet injection, while xdp-bench provides detailed statistics on XDP program execution across modes.[10][85] Throughput scales linearly with the number of CPU cores and RSS queues, and hardware offload modes completely bypass host CPU cycles for processed packets.[10]
In mixed deployments, 2024 studies on AF_XDP confirm end-to-end delays below 10 μs when using busy polling and optimized socket parameters, supporting latency-sensitive applications.[86]