ftrace
ftrace is a framework of tracing utilities built directly into the Linux kernel for debugging and analyzing kernel performance issues, as well as tracing latencies and events outside of user space.[1] Introduced in Linux kernel version 2.6.27, ftrace was developed primarily by Steven Rostedt with contributions from Red Hat Inc. and VMware Inc.; it has featured dynamic function tracing since its inception and gained subsequent enhancements in kernels such as 3.10 and 4.13 for improved function graph tracing and reduced overhead.[1] It operates via the tracefs filesystem, typically mounted at /sys/kernel/tracing, where users can configure and control tracing through files like current_tracer, available_tracers, and set_ftrace_filter.[1]
At its core, ftrace enables function tracing, which records the entry and exit of kernel functions using mechanisms like mcount or fentry calls, allowing developers to filter specific functions or modules with glob patterns for targeted analysis.[1] It also supports event tracing through hundreds of static tracepoints for kernel events such as scheduling switches or interrupts, accessible via the events directory in tracefs.[1] Additional components include latency tracers like irqsoff (for interrupt-off durations), preemptoff (for preemption-disabled periods), and wakeup tracers for real-time scheduling analysis, all leveraging a ring buffer to store timestamped trace data readable from files like trace or trace_pipe.[1]
ftrace's dynamic ftrace feature minimizes runtime overhead by patching function prologue calls to no-ops when tracing is disabled, enabling efficient toggling without recompiling the kernel.[1] Built on this foundation are specialized tracers such as function_graph (for visualizing call graphs with durations), stack tracer (for monitoring maximum stack usage), and hardware latency detector (hwlat) for identifying non-preemptible code paths.[1] Overall, ftrace provides a low-overhead, extensible toolset essential for kernel developers to diagnose issues like performance bottlenecks, real-time latencies, and system behavior in production environments.[1]
Overview
Definition and Purpose
ftrace is an internal tracer within the Linux kernel, designed to enable the tracing of kernel functions and events while imposing minimal performance overhead. It facilitates dynamic instrumentation, allowing developers to enable or disable tracing at runtime without the need to recompile the kernel. This framework provides a lightweight mechanism for capturing detailed execution data directly from the kernel.[2]
The primary purpose of ftrace is to support kernel debugging, performance profiling, and latency analysis, particularly for understanding execution flows in critical areas such as task scheduling, interrupts, and I/O operations. It helps identify issues like excessive latencies between interrupt disable and enable events, preemption delays, or the time from task wakeup to scheduling. By focusing on events outside user space, ftrace aids system designers in diagnosing problems that affect overall system behavior.[3][4]
Initially centered on tracing function calls via mechanisms like mcount, ftrace has expanded to encompass broader kernel operations, including hundreds of static tracepoints for monitoring file systems, hardware events, and other subsystems. Developed by Steven Rostedt to meet the demand for efficient tracing in production environments, it offers a low-overhead alternative to heavier tools like SystemTap, keeping its impact small while tracing is active and negligible when tracing is disabled.[3][5][6]
Key Benefits
ftrace provides exceptionally low runtime overhead during tracing, achieved through optimized assembly-level hooks that minimize intrusion. This efficiency stems from its design, which allows tracing to be enabled or disabled dynamically with negligible impact, rendering it ideal for deployment on live production systems without significant disruption.[1] As a result, kernel developers and system administrators can perform diagnostics in real-time environments where even modest slowdowns would be unacceptable.[7]
A key advantage of ftrace is its flexibility, supporting both static instrumentation compiled into the kernel at build time and dynamic tracing activated at runtime, all without necessitating kernel rebuilds or reboots.[1] This dual approach enables rapid adaptation to varying debugging needs, from broad function monitoring to targeted event observation, enhancing its utility across diverse kernel configurations.[8]
ftrace delivers comprehensive tracing capabilities, including full stack traces, function call graphs, and event-specific probes, which facilitate in-depth analysis of kernel behaviors such as interrupt latencies and scheduler bottlenecks.[1] These features allow users to pinpoint performance issues and anomalies with precision, supporting diagnostics that would otherwise require more invasive methods.[9]
The tool's accessibility further amplifies its value, as basic operations demand no specialized hardware or external userspace dependencies, relying solely on the kernel's built-in tracefs interface.[1] While optional userspace tools can augment visualization and analysis, the core functionality operates independently, broadening its applicability. In resource-constrained embedded systems, where heavier tracers like SystemTap may be impractical due to memory and CPU limitations, ftrace stands out for its lightweight footprint.[10] Its per-CPU ring buffer design bolsters scalability on multi-core architectures, ensuring consistent performance under high concurrency.
As of November 2025, ongoing kernel developments include proposals to deprecate auto-mounting tracefs within debugfs for improved separation, and fixes for vulnerabilities such as CVE-2024-56569 affecting ftrace stability.[1][11][12]
History
Origins
ftrace was developed by Steven Rostedt, a kernel developer at Red Hat, starting around 2007 as part of initiatives to bolster kernel debugging and performance analysis tools. The framework originated from components in the real-time Linux kernel patches, particularly latency tracing efforts led by Ingo Molnar, and built upon earlier work by Arnaldo Carvalho de Melo on mcount-based tracing code.[13] Initial copyrights reflect this timeline, with Rostedt's contributions dated from 2007 onward.
The primary motivations stemmed from shortcomings in prevailing kernel tracing methods, such as kprobes, which introduced significant invasiveness through dynamic instrumentation and elevated runtime overhead, making them unsuitable for always-on or low-impact scenarios. ftrace addressed this by providing a built-in, efficient function tracer modeled after user-space profiling techniques, leveraging the established mcount mechanism from tools like gprof to enable lightweight call graph analysis without heavy modifications. This approach filled a critical void for non-intrusive, kernel-native tracing that could operate with minimal performance penalty when disabled.[2][14]
Early prototypes focused on dynamic function graphing via mcount calls, compiled into the kernel using the GCC -pg flag to insert profiling hooks at function entries. These were rigorously tested through kernel patches in private development before broader exposure, with an emphasis on overhead: benchmarks showed roughly a 13% slowdown in workloads like hackbench when the mcount hooks were compiled in but tracing was inactive, a cost that motivated the later dynamic patching of call sites to no-ops. The prototypes introduced mechanisms for enabling tracing via sysfs interfaces and outputting hierarchical call data, laying the groundwork for scalable kernel introspection.[14][2]
The first public discussions of ftrace occurred on the Linux kernel mailing list (LKML) in January 2008, where Rostedt presented an RFC patch series titled "mcount tracing utility." This series highlighted the need for a lightweight, always-on tracing solution to pinpoint latency sources and debug kernel issues, sparking community feedback that refined the design prior to mainline integration.[14]
Kernel Integration and Evolution
ftrace was integrated into the mainline Linux kernel with version 2.6.27, released on October 9, 2008, initially as the CONFIG_FUNCTION_TRACER option to enable basic function tracing capabilities.[15] This merge marked the transition from experimental development to a stable kernel feature, building on earlier work by Steven Rostedt to provide lightweight kernel instrumentation without requiring kernel recompilation.[16]
Subsequent expansions enhanced ftrace's scope. In kernel 2.6.29, released in March 2009, graph tracing was added, allowing visualization of function call graphs to analyze execution paths and dependencies.[17] Subsequent kernels in the late 2.6 series added support for tracepoints to capture specific kernel events and latency tracers like irqsoff and preemptoff to measure critical-path delays such as interrupt handling and preemption times.[18][19] Post-2016, optimizations focused on the ring buffer, improving lockless operations and timestamp accuracy to handle high-volume tracing with reduced overhead.[20]
Over time, ftrace evolved from a function-only tracer into a versatile framework supporting dynamic kprobe-based trace events (merged in 2.6.33), which expose kprobe-style kernel probing through the ftrace interface, and uprobes (introduced in 3.5 for user-space tracing).[21][22] It also extended to hardware events through integration with the perf subsystem, enabling combined software-hardware tracing. Recent developments from 2020 to 2025 have emphasized architectural portability, including enhanced ARM64 support for direct calls and trampolines to optimize BPF-based tracing.[23][24]
By kernel 5.10, released in December 2020, ftrace had become foundational for advanced tools like ftrace-direct, which enables direct function calls within tracers for efficient, low-overhead attachments, particularly for BPF programs.[25] In kernel 6.14, released in March 2025, the function_graph tracer was enhanced to support multiple concurrent users, enabling fprobe integration for more efficient function entry/exit probing.[26] Ongoing maintenance is led by Steven Rostedt and the kernel tracing subsystem team, ensuring compatibility and performance refinements across kernel releases.
Architecture
Core Components
The core of ftrace lies in its ring buffer, a high-performance circular data structure designed to capture and store trace events with minimal overhead. Implemented as per-CPU buffers, it allows each processor to write trace data atomically without needing global locks, thereby reducing contention and cache invalidation in multiprocessor environments. The buffer size is configurable, typically defaulting to about 1 MB per CPU, and events are appended sequentially until the buffer wraps around, overwriting the oldest data to ensure continuous operation even under high load.[27]
Trace points form another foundational element, serving as static instrumentation markers embedded directly into the kernel source code. These are defined using the TRACE_EVENT macro, which generates the necessary structures and functions to log predefined events such as system calls (e.g., sys_enter_open) or page faults without altering the original code's logic. When enabled, trace points invoke callbacks that serialize event data, including timestamps, CPU IDs, and parameters, directly into the ring buffer, enabling precise observation of kernel internals like interrupt handling or memory management operations.[28]
Function entry and exit hooks provide dynamic tracing capabilities by intercepting function calls at runtime. These hooks leverage low-level assembly instructions, such as mcount on traditional architectures or the more efficient fentry on modern ones, inserted at the prologue and epilogue of eligible functions during kernel compilation. Upon invocation, they record entry timestamps and parameters or exit details, feeding this information into the ring buffer; dynamic ftrace further optimizes this by patching call sites in memory to enable or disable tracing without recompilation, supporting selective filtering of functions.[29]
Control interfaces expose ftrace's functionality through the debugfs (or tracefs) virtual filesystem, mounted by default at /sys/kernel/debug/tracing (or /sys/kernel/tracing). This directory contains files like current_tracer for selecting active tracers, tracing_on for enabling or disabling capture, and trace for reading buffer contents, allowing runtime configuration without rebooting the system. Filters can be applied via set_ftrace_filter to target specific functions or events, integrating seamlessly with the ring buffer and hooks to manage data flow and prevent overload.[27]
A critical safeguard in ftrace's design is the notrace annotation, which excludes designated functions, particularly those within the tracing infrastructure itself, from being hooked, thereby preventing infinite recursion that could crash the kernel. Introduced in early implementations to ensure stability, the annotation is applied to core components like the ring buffer management code; at runtime, the set_ftrace_notrace interface provides an analogous way to exclude arbitrary functions.[30]
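The control files mentioned above can be inspected directly from a root shell. The following is a minimal sketch assuming tracefs is mounted at /sys/kernel/tracing; the exact set of files varies with the kernel configuration:
    cd /sys/kernel/tracing
    cat available_tracers    # tracers compiled into this kernel, e.g. function_graph function nop
    cat current_tracer       # reads "nop" when no tracer is active
    cat tracing_on           # 1 = ring buffer accepts writes, 0 = writes are paused
    cat buffer_size_kb       # per-CPU ring buffer size, in kilobytes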
Instrumentation Methods
ftrace achieves compile-time instrumentation by compiling the Linux kernel with the GCC -pg flag, which inserts calls to the mcount() function at the entry point of each instrumented function.[1] This generates a list of call sites recorded in the __mcount_loc section by the recordmcount tool during the build process.[1] Since GCC version 4.6, the -mfentry flag provides an alternative by inserting calls to __fentry__, which offers lower overhead due to its position before the function prologue and avoidance of stack manipulation.[1]
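The effect of these flags can also be observed outside the kernel build. The following is a rough user-space illustration on an x86-64 GCC toolchain; the kernel adds equivalent flags automatically when CONFIG_FUNCTION_TRACER is enabled, and recordmcount then collects the resulting call sites:
    printf 'int add(int a, int b) { return a + b; }\n' > demo.c
    gcc -pg -c demo.c -o demo_mcount.o             # classic profiling hook
    objdump -dr demo_mcount.o | grep -A8 '<add>:'  # a call to mcount appears inside the function body
    gcc -pg -mfentry -c demo.c -o demo_fentry.o    # __fentry__ variant (GCC >= 4.6, x86)
    objdump -dr demo_fentry.o | grep -A8 '<add>:'  # the call to __fentry__ is the first instruction, before the prologue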
At runtime, ftrace enables dynamic instrumentation by patching the kernel's text section, replacing the no-operation (NOP) instructions that stand in for the profiling calls (the recorded mcount/__fentry__ sites are converted to NOPs at boot, or emitted as NOPs directly by newer toolchains) with active calls to the tracing handler.[1] To ensure safety on multi-processor systems, this patching occurs atomically using the stop_machine() mechanism, which quiesces all CPUs, or more recent breakpoint-based methods that handle synchronization without full system stops.[1] This approach allows tracing to be enabled or disabled with minimal performance impact when inactive, as the NOPs impose negligible overhead.[1]
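On a kernel built with CONFIG_DYNAMIC_FTRACE, the patch state can be observed through tracefs; a brief sketch, assuming the /sys/kernel/tracing mount:
    cat /sys/kernel/tracing/dyn_ftrace_total_info   # number of call sites dynamic ftrace manages
    cat /sys/kernel/tracing/enabled_functions       # sites currently patched to call a tracer; empty while tracing is idle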
Filter mechanisms in ftrace reduce tracing noise and overhead through predicate-based selection. The set_ftrace_filter interface accepts function names or glob patterns (e.g., sched_*) to limit tracing to specific kernel functions, while set_ftrace_notrace excludes unwanted ones.[1] For process-level control, set_ftrace_pid restricts tracing to threads matching listed process IDs, caching mappings for efficient runtime checks.[1]
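A short sketch of PID-scoped filtering from a root shell; the function-fork option, where present, extends the PID filter to children forked by the traced task:
    cd /sys/kernel/tracing
    echo $$ > set_ftrace_pid           # restrict function tracing to the current shell
    echo 1 > options/function-fork     # also follow children the shell forks, if this option exists
    echo 'vfs_*' > set_ftrace_filter   # glob keeps the output manageable
    echo function > current_tracer
    ls /tmp > /dev/null                # generate some filesystem activity
    head -20 trace
    echo nop > current_tracer          # stop tracing and reset the filter
    echo > set_ftrace_filter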
ftrace integrates with dynamic probes via kprobes, enabling users to attach handlers to arbitrary kernel instructions without recompilation; these leverage ftrace's infrastructure for efficient entry-point hooking when CONFIG_KPROBES_ON_FTRACE is enabled.[31] Static tracepoints, predefined in the kernel source using the TRACE_EVENT macro, provide event-specific instrumentation points compiled directly into the code, accessible through the events tracefs directory for enabling and filtering.[1]
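As an illustration of combining the two probe types, the kprobe_events file accepts dynamic probe definitions while static tracepoints are toggled through the events directory. The probed function name (do_sys_openat2 here) varies across kernel versions and is only an example:
    cd /sys/kernel/tracing
    echo 'p:myprobe do_sys_openat2' >> kprobe_events   # define a dynamic kprobe-based event
    echo 1 > events/kprobes/myprobe/enable
    echo 1 > events/sched/sched_switch/enable          # enable a static TRACE_EVENT tracepoint
    head -20 trace_pipe                                # stream records from both sources
    echo 0 > events/kprobes/myprobe/enable
    echo 0 > events/sched/sched_switch/enable
    echo '-:myprobe' >> kprobe_events                  # remove the dynamic probe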
Following the development of live kernel patching around 2014, ftrace's dynamic patching capabilities have supported hot-swapping of kernel functions and tracers without requiring a system reboot, by redirecting calls to patched implementations via ftrace handlers.[31] In 2023, updates to ftrace on ARM64 architectures introduced per-callsite patching and optimized NOP insertion via GCC's -fpatchable-function-entry flag, reducing enable/disable overhead and kernel image size by up to 2% compared to traditional mcount methods.[32]
Features
Function Tracing
Function tracing in ftrace enables the monitoring of kernel function execution by capturing entry and exit events, providing essential insights into kernel behavior with minimal overhead. This capability relies on instrumentation hooks, such as mcount calls, inserted at function boundaries to log data into a per-CPU ring buffer.[33] The recorded information includes timestamps in microseconds, process IDs (PIDs), CPU identifiers, and execution durations, allowing developers to analyze function call sequences and performance characteristics.[2] In its basic operation, the function tracer invokes a callback on each function entry (the function_graph variant also hooks returns to capture exit details), filtering traces based on enabled functions to avoid excessive logging. For instance, an entry event might appear as bash-1977 [000] .... 17284.993652: sys_close <-system_call_fastpath, showing the process name and PID, CPU, flags, timestamp, traced function, and caller context, while function_graph exit records append duration metrics like 3.177 us.[2] This setup facilitates debugging and profiling by revealing call paths without requiring recompilation.[33]
Dynamic control of function tracing is managed through the /sys/kernel/tracing/set_ftrace_filter interface, which accepts glob patterns to selectively enable tracing for targeted functions. An example is filtering to scheduler functions with the pattern schedule*, which activates tracing only for matching names like schedule_timeout, thereby focusing output and mitigating performance impact.[2] This glob support, with wildcards allowed at the beginning, end, or middle of a name, ensures precise selection from the pool of instrumented functions.[2]
Traces are accessible in raw text format via the trace_pipe file for live, streaming consumption or in binary format through trace_pipe_raw for efficient parsing by user-space tools.[2] Output records incorporate overhead metrics, such as per-function durations, to quantify execution time and identify latency sources.[2]
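Putting the pieces together, a short root-shell session (a sketch assuming the function tracer is compiled in) exercises both output paths; trace_pipe_raw is exposed per CPU under the per_cpu directory:
    cd /sys/kernel/tracing
    echo 'schedule*' > set_ftrace_filter
    echo function > current_tracer
    head -20 trace_pipe                                                  # human-readable, consuming stream
    dd if=per_cpu/cpu0/trace_pipe_raw of=/tmp/cpu0.raw bs=4096 count=8   # raw binary pages for offline parsing
    echo nop > current_tracer
    echo > set_ftrace_filter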
As the core of ftrace, function tracing was introduced in 2008 within Linux kernel version 2.6.27 to support real-time debugging needs.[34] Dynamic ftrace keeps the selected function addresses in hash tables, so filtering remains scalable even when large numbers of functions are enabled.[33]
Graph and Latency Tracing
ftrace's function graph tracing extends basic function tracing by capturing both entry and exit points of functions, thereby recording caller-callee relationships to visualize execution paths. This tracer, known as the function_graph tracer, maintains a stack of return addresses for each task, enabling the reconstruction of call hierarchies. The output is presented as an indented, ASCII-art call tree in which nested calls are bracketed by braces ({ on entry and } on exit) and annotated with duration measurements in microseconds, allowing users to identify bottlenecks in execution flows. A leaf call appears on a single line with its duration, for example 1)   2.333 us    |  kmem_cache_free();, while the closing brace of a nested call carries the cumulative time spent beneath it.[1]
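A minimal session, assuming CONFIG_FUNCTION_GRAPH_TRACER is enabled; set_graph_function restricts graphing to the call tree beneath the named functions:
    cd /sys/kernel/tracing
    echo vfs_read > set_graph_function     # graph only calls made beneath vfs_read
    echo function_graph > current_tracer
    head -30 trace                         # nested braces with per-call durations in microseconds
    echo nop > current_tracer
    echo > set_graph_function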
Latency tracing in ftrace focuses on measuring the duration of time spent in critical kernel sections where responsiveness may be impaired, such as regions protected by preempt_disable(). Specialized tracers like irqsoff, preemptoff, and preemptirqsoff record the maximum latency during which interrupts or preemption are disabled, capturing the function call paths that contribute to these delays. These tracers generate detailed reports including timestamps, latency values (e.g., 259 µs for interrupt-off periods), and associated stack traces to pinpoint offending code. Histograms can be derived from the trace data to analyze distributions of IRQ-off times or softirq execution durations, aiding in the optimization of real-time systems. For instance, the preemptirqsoff tracer combines both preemption and interrupt disabling to reveal combined latencies up to several hundred microseconds in complex workloads.[1]
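For example, the irqsoff tracer (available when CONFIG_IRQSOFF_TRACER is set) retains only the worst case it has observed; a rough sketch of resetting and reading it:
    cd /sys/kernel/tracing
    echo irqsoff > current_tracer
    echo 0 > tracing_max_latency     # reset the recorded maximum, in microseconds
    sleep 5                          # let the system run under a representative load
    cat tracing_max_latency          # longest interrupts-off section observed
    head -40 trace                   # report with the call path responsible for that maximum
    echo nop > current_tracer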
Stack tracing complements these features by capturing complete call stacks at designated trace points, which is particularly useful for diagnosing issues like memory leaks or deadlocks. Enabled through the stacktrace option or the dedicated stack tracer (requiring CONFIG_STACK_TRACER), it records the full backtrace when a trace event occurs, displaying the sequence of functions from the current point up to the entry point. This includes details such as stack depth and size, with each frame listed on its own => function line to illustrate the call chain. In latency tracers, stack traces are automatically appended to high-latency events, providing context for analysis. The stack tracer also monitors maximum stack usage across function calls, alerting on potential overflows.[1]
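The dedicated stack tracer is toggled through a sysctl rather than current_tracer; a brief sketch:
    echo 1 > /proc/sys/kernel/stack_tracer_enabled   # requires CONFIG_STACK_TRACER
    cat /sys/kernel/tracing/stack_max_size           # deepest kernel stack usage observed, in bytes
    cat /sys/kernel/tracing/stack_trace              # the call chain that produced that maximum
    echo 0 > /proc/sys/kernel/stack_tracer_enabled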
ftrace's latency capabilities were notably integrated with Latencytop, a tool for identifying latency sources in user-space and kernel activities, around 2009 to enable kernel-side sampling without significant overhead. Recent enhancements, including support for the x86 TSC (Time Stamp Counter) clock, provide hardware-based timestamping for sub-microsecond precision on x86 architectures, improving the accuracy of latency measurements across multi-CPU systems.[1][20]
Usage
Enabling and Configuration
To enable ftrace, the Linux kernel must first be compiled with appropriate configuration options. The core tracing infrastructure is activated by setting CONFIG_TRACING=y in the kernel configuration file (.config), which enables the overall tracing support including tracefs. For function tracing specifically, CONFIG_FUNCTION_TRACER=y must be selected under the "Kernel hacking" menu in tools like make menuconfig; this inserts mcount calls or equivalent instrumentation into kernel functions during compilation. Additional options such as CONFIG_DYNAMIC_FTRACE=y allow for runtime enabling and disabling of tracing with minimal overhead by converting no-op instructions to calls dynamically. These settings ensure ftrace is available post-boot without requiring module loading, as it is built into the kernel.[1][35]
At runtime, ftrace is accessed via the tracefs filesystem, though for compatibility with older kernels (pre-4.1), debugfs can be mounted to expose the tracing interface. To set up, mount debugfs at /sys/kernel/debug with the command mount -t debugfs none /sys/kernel/debug, which automatically makes tracefs files available under /sys/kernel/debug/tracing. Alternatively, directly mount tracefs at /sys/kernel/tracing using mount -t tracefs nodev /sys/kernel/tracing for modern kernels. Once mounted, tracing can be globally enabled by writing 1 to the tracing_on control file: echo 1 > /sys/kernel/debug/tracing/tracing_on (or the equivalent path in tracefs). This starts capturing trace data into per-CPU ring buffers; to disable, write 0 instead.[1]
Function filtering refines what is traced to reduce overhead and focus on relevant code paths. The file available_filter_functions lists all kernel functions eligible for tracing, which can be viewed with cat /sys/kernel/debug/tracing/available_filter_functions. To enable tracing only for specific functions, write their names (supporting glob patterns) to set_ftrace_filter, e.g., echo do_sys_open > /sys/kernel/debug/tracing/set_ftrace_filter. Functions to exclude are specified via set_ftrace_notrace in a similar manner, such as echo do_exit > /sys/kernel/debug/tracing/set_ftrace_notrace. Clearing a filter is done by writing an empty string to the respective file. These filters apply globally unless instance-specific tracing is configured.[1]
Buffer management controls the storage for trace data, preventing overflow and ensuring sufficient capacity. The per-CPU trace buffer size is adjusted in kilobytes via buffer_size_kb, for example, echo 4096 > /sys/kernel/debug/tracing/buffer_size_kb to set 4 MB per CPU. The total buffer size across all CPUs can be queried with cat /sys/kernel/debug/tracing/buffer_total_size_kb. To clear the buffers and reset tracing without stopping it, write to the trace file: echo > /sys/kernel/debug/tracing/trace. Sub-buffer sizes can also be tuned with buffer_subbuf_size_kb for finer control over ring buffer allocation.[1]
Since Linux kernel version 3.0, boot-time configuration of ftrace buffers is supported via kernel command-line parameters. The trace_buf_size option sets the initial buffer size per CPU, e.g., appending trace_buf_size=10M to the bootloader command line (like in GRUB's linux line) allocates 10 MB buffers early in boot. This allows tracing from initramfs stages onward without manual runtime setup. Since Linux 6.12, the tracing ring buffer can be allocated in reserved memory that persists across reboots using kernel command-line parameters such as reserve_mem=size:offset:trace and trace_instance=boot_map@address:size. This enables access to trace data from the previous boot via /sys/kernel/tracing/instances/boot_map/trace, provided the kernel layout is consistent and the reserved memory is retained by the hardware.[36][37]
Tools and Interfaces
ftrace provides several userspace tools and interfaces for capturing, analyzing, and visualizing tracing data from the kernel's tracefs filesystem, typically mounted at /sys/kernel/tracing. One of the simplest built-in interfaces is reading the /sys/kernel/tracing/trace file using cat, which outputs the entire contents of the trace buffer in a human-readable format, including timestamps, process IDs, and function entries or events. This method allows users to view full logs without consuming the buffer data, making it suitable for post-capture inspection; however, to clear the buffer after reading, the file can be opened with the O_TRUNC flag.[1]
For more advanced scripted recording and replay, the trace-cmd utility serves as a powerful front-end to ftrace, enabling automated capture of traces across multiple events and CPUs. Developed by Steven Rostedt in 2010, trace-cmd supports commands like record to start tracing with specified filters, start and stop for controlling ongoing sessions, and report for replaying saved traces from binary .dat files. It facilitates integration with monitoring tools.[38]
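Typical invocations, assuming the trace-cmd package is installed and run as root; the -e, -p, and -l options select events, a tracer plugin, and a function filter respectively:
    trace-cmd record -e sched_switch -e sched_wakeup sleep 5   # capture scheduler events into trace.dat
    trace-cmd report | head                                    # replay the binary trace as text
    trace-cmd record -p function_graph -l 'vfs_*' sleep 1      # function_graph tracing limited to vfs_* functions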
Visualization tools enhance the interpretability of ftrace data beyond plain text logs. KernelShark offers a graphical interface for analyzing trace files generated by trace-cmd, displaying timelines, event correlations, and zoomable views of function calls and latencies to aid in debugging complex kernel behaviors. Additionally, the function_graph tracer's output can be processed with scripts to generate DOT files for rendering call graphs using Graphviz, providing a directed graph representation of function entry and exit points.[39][40]
Advanced utilities extend ftrace's utility for performance profiling, such as integrating kernel traces with flame graphs for stack trace visualization. Tools like Brendan Gregg's FlameGraph scripts can process ftrace function or graph tracer outputs—often via intermediate steps with perf or custom parsers—to produce interactive SVG flame graphs that highlight hot code paths and call hierarchies, emphasizing cumulative execution time without delving into full event listings.[41]
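A commonly used pipeline, sketched here on the assumption that perf and a local clone of the FlameGraph scripts are available; ftrace function-graph output can be fed through similar collapse scripts:
    git clone https://github.com/brendangregg/FlameGraph
    perf record -F 99 -a -g -- sleep 30                                                          # sample kernel and user stacks system-wide
    perf script | ./FlameGraph/stackcollapse-perf.pl | ./FlameGraph/flamegraph.pl > flame.svg    # render an interactive SVG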
Integrations
With perf
The perf tool integrates with ftrace by utilizing its tracepoints and function hooks as a backend to capture and record kernel events, enabling detailed software tracing without requiring separate configurations. For example, the command perf trace -e 'sched:*' leverages ftrace's scheduling tracepoints to display kernel events related to task scheduling in real time, providing insights into context switches and latencies. perf discovers these tracepoints and their event formats through the tracefs interface and records them into its own ring buffer, synchronizing timestamps via a shared clock to ensure accurate event ordering.[1]
perf extends ftrace's capabilities by combining its data with sampling-based profiling; for instance, perf record can capture ftrace tracepoints alongside call stack samples, facilitating analysis of both event flows and hot code paths in a single session. The perf script utility further supports replaying these traces, allowing users to filter, aggregate, and visualize ftrace events in conjunction with perf's sampled data for comprehensive kernel behavior reconstruction.
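For instance, tracepoint events can be recorded alongside call-graph samples and then decoded with perf script (a sketch; event and option availability depends on the kernel and perf build):
    perf record -e sched:sched_switch -e sched:sched_wakeup -a -- sleep 5   # record scheduler tracepoints system-wide
    perf script | head                                                      # print the decoded events
    perf record -e sched:sched_switch -g -a -- sleep 5                      # attach call-graph samples to each event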
This integration offers key advantages by augmenting ftrace's static software tracing with perf's dynamic features, such as userspace function tracing via uprobes, hardware performance counter monitoring (e.g., CPU cycles and cache misses), and advanced event filtering based on criteria like process ID or CPU.[42] As a result, users gain a unified profiling framework that correlates low-level kernel traces with system-wide metrics, improving diagnostics for performance bottlenecks. The foundational support for perf to utilize ftrace function tracing was merged in Linux kernel 3.3, building on earlier tracepoint interactions introduced around kernel 2.6.31.[43] In kernel 4.11, the dedicated perf ftrace subcommand was added as a user-friendly wrapper, simplifying access to ftrace's function graphing and latency tracing modes.[44]
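The perf ftrace wrapper drives the same tracers from the command line; option names vary somewhat across perf versions, so the following is only a sketch:
    perf ftrace -a -- sleep 1                                       # function tracing for the whole system while the command runs
    perf ftrace -t function_graph -G vfs_read -- cat /etc/hostname  # graph the calls made beneath vfs_read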
With eBPF
eBPF programs can attach to ftrace's static tracepoints, which are predefined instrumentation points in the kernel code that leverage ftrace's infrastructure for efficient event recording. These attachments enable programmable extensions to ftrace tracing, allowing developers to implement custom logic for filtering events, aggregating data, or triggering actions directly in kernel space without modifying the kernel source. Tools such as bpftrace and libbpf simplify this process by compiling high-level scripts into eBPF bytecode and handling the attachment to tracepoints via the BPF_PROG_TYPE_TRACEPOINT program type. This integration became available starting with Linux kernel 4.7, enhancing ftrace's capabilities for dynamic, low-overhead observability.[45][1]
A key advancement in ftrace-eBPF synergy is the ftrace-direct mechanism, which permits eBPF programs to hook into kernel functions using ftrace's dynamic instrumentation without relying on pre-existing tracepoints. This feature employs BPF trampolines, where the kernel patches the target function's entry point to directly invoke the eBPF handler, minimizing indirection and reducing tracing overhead compared to traditional kprobes or indirect calls. fentry and fexit programs (BPF_PROG_TYPE_TRACING programs with the BPF_TRACE_FENTRY and BPF_TRACE_FEXIT attach types) utilize this for entry and exit tracing, with support indicated by a 'D' flag in ftrace's enabled_functions file when an eBPF trampoline is active. Initially developed for x86 architectures, ftrace-direct support extended to arm64 in kernel 6.3, enabling broader use of eBPF for function-level tracing across platforms.[1][23]
Hybrid approaches combine ftrace's per-CPU ring buffers with eBPF's map structures to enable sophisticated tracing workflows, such as aggregating statistics in kernel space before outputting to user space. For instance, an eBPF program attached to an ftrace tracepoint can increment counters in a BPF_MAP_TYPE_HASH or BPF_MAP_TYPE_PERCPU_ARRAY map for real-time metrics collection, while simultaneously writing detailed events to ftrace's ring buffer (accessible via /sys/kernel/tracing/trace) or eBPF's own BPF_MAP_TYPE_RINGBUF for structured data export. This setup supports user-kernel tracing pipelines, where eBPF processes raw ftrace data for filtering or summarization, reducing user-space overhead and enabling features like histogram generation or conditional logging. Such combinations are common in tools from the BCC (BPF Compiler Collection) suite, where eBPF enhances ftrace's raw event streaming with programmable data paths.[1][46]
ftrace-eBPF integration has evolved significantly since its early stages in the Linux 4.x series, with unprivileged eBPF support introduced in kernel 4.4 laying groundwork for safer program loading, though full tracing attachments matured in subsequent releases like 4.7.[46][47][48]
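A sketch of the two attachment styles described above using bpftrace; it assumes a BTF-enabled kernel and root privileges, and the probed function and field names are illustrative and may differ between kernel versions:
    # count context switches per incoming task via the static sched_switch tracepoint
    bpftrace -e 'tracepoint:sched:sched_switch { @switches[args->next_comm] = count(); }'
    # sum bytes requested from vfs_read via a BTF-based fentry-style probe
    bpftrace -e 'kfunc:vfs_read { @bytes = sum(args->count); }'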
Limitations
Performance Overhead
ftrace incurs performance overhead primarily from CPU cycles expended by tracing hooks and memory allocated for its ring buffers. When tracing is enabled without filters, function hooks can introduce noticeable CPU overhead, with benchmarks showing latency increases of up to 12% even in minimal configurations like the NOP tracer.[49] Overhead can vary significantly depending on the configuration, load, and filtering, ranging from negligible when inactive to substantial for unfiltered tracing under high loads.[50][51] Memory overhead stems from per-CPU ring buffers, which default to approximately 1 MB per CPU after initial expansion from a minimal size of a few pages, scaling with system core count and configurable via the buffer_size_kb file.[52][1]
Overhead can be measured using the per-CPU statistics available in /sys/kernel/debug/tracing/per_cpu/cpuX/stats, which report trace entries, overruns, and other metrics to quantify buffer pressure and processing costs. Dynamic tracers, such as the function graph tracer, may add additional overhead during high-load operations due to increased instrumentation and data collection. Latency-sensitive tracers like irqsoff or preemptoff provide further insights via tracing_max_latency, capturing maximum delays in microseconds attributable to tracing activity.
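For example, the per-CPU statistics and latency records can be read directly; the paths below use the legacy debugfs mount, and /sys/kernel/tracing works equally on newer kernels:
    cat /sys/kernel/debug/tracing/per_cpu/cpu0/stats                # entries, overruns, and bytes consumed on CPU 0
    grep -H 'overrun' /sys/kernel/debug/tracing/per_cpu/cpu*/stats  # spot buffer pressure across all CPUs
    cat /sys/kernel/debug/tracing/tracing_max_latency               # worst-case latency recorded by the active latency tracer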
Several mitigations exist to minimize ftrace's impact. Filtering traced functions or events—via files like set_ftrace_filter and set_ftrace_notrace—significantly reduces the volume of data processed, lowering CPU usage by limiting hook invocations.[1] [50] Per-CPU ring buffers enable atomic, contention-free writes, avoiding global locks that could amplify overhead in multi-core systems.[1] Additionally, disabling unused tracers by selecting the nop tracer or toggling tracing_on to zero halts buffer writes while retaining configuration, ensuring near-zero runtime cost when tracing is inactive.[1] [54]
Benchmarks demonstrate that ftrace exhibits less than 1% overhead on idle systems when properly filtered and inactive, making it suitable for production environments with intermittent use. As of 2025, ongoing kernel developments, such as those for arm64, have further optimized per-callsite tracing to reduce overhead in multi-tracer environments.[32]
Alternatives and Comparisons
ftrace offers a lightweight, kernel-integrated approach to function and event tracing, distinguishing it from more scripting-oriented tools like SystemTap. While SystemTap provides flexible scripting capabilities for custom probes across kernel and user-space, it incurs higher overhead due to the need to compile and load kernel modules for each script.[55][56] In contrast, ftrace leverages built-in kernel interfaces, such as those under /sys/kernel/debug/tracing/, enabling low-overhead tracing without additional module loading, making it preferable for quick, basic kernel diagnostics where scripting complexity is unnecessary.[55]
Compared to LTTng, ftrace emphasizes simplicity for kernel-specific traces, such as function graphs and latency measurements. While both support circular buffering via ring buffers, LTTng offers more advanced configurability, such as multiple channels and sub-buffers, for comprehensive system tracing, supporting both kernel and user-space events with tools like Trace Compass for visualization, which suits scenarios requiring detailed application-level analysis over ftrace's kernel-focused scope.[57] For instance, LTTng's multi-component tracing makes it ideal for virtualized workloads, whereas ftrace suffices for bare-metal kernel debugging with minimal setup.[57]
ftrace's Linux-native integration and open-source availability position it favorably against DTrace, which originated in Solaris and BSD systems and requires ports like DTrace for Linux to function.[58] While DTrace offers robust, scriptable probes for system-wide observability, its non-native status on Linux introduces compatibility challenges and lacks the seamless kernel embedding of ftrace.[58] Users often select ftrace for cost-free, straightforward Linux kernel probing, reserving DTrace ports for environments needing its mature scripting ecosystem.[59]
eBPF has emerged as a programmable successor to ftrace for complex tracing needs, building on ftrace's foundational tracepoints and kprobes to enable sandboxed kernel programs without source modifications.[60] This evolution allows eBPF to handle advanced observability and networking tasks, while ftrace remains the core layer for efficient, low-level kernel hooks.[60] Recent analyses highlight ftrace's prevalence in kernel debugging, underscoring its role alongside integrations like perf and eBPF for broader tracing workflows.[61]