
Slab allocation

Slab allocation is a memory management mechanism designed to efficiently handle the allocation and deallocation of small, frequently requested kernel objects by maintaining caches of pre-initialized slabs, where each slab consists of one or more contiguous pages divided into fixed-size object slots. This approach minimizes internal fragmentation, reduces the overhead of object initialization on each allocation, and optimizes processor cache utilization through techniques like slab coloring, which offsets objects to avoid cache line sharing across slabs. By reusing objects from full, partial, or empty slabs within per-object-type caches, slab allocation enables rapid access to ready-to-use structures such as inodes or file descriptors, making it particularly suited for high-frequency, short-lived allocations in operating system kernels.

The slab allocator was originally developed by Jeff Bonwick for the SunOS 5.4 kernel, as detailed in his 1994 USENIX paper, where it was introduced as an object-caching system to address inefficiencies in traditional memory allocators like the binary buddy system. Bonwick's design emphasized caching frequently used objects in a warm, initialized state to eliminate repetitive setup costs and fragmentation from variable-sized allocations. This innovation was later extended in a 2001 paper by Bonwick and Jonathan Adams, which incorporated magazines for multiprocessor scalability and vmem for managing arbitrary resource arenas beyond physical memory.

In the Linux kernel, slab allocation was adapted starting with version 2.2 in 1999, drawing directly from Bonwick's original concepts to provide a general-purpose allocator for kernel objects. Linux originally implemented three variants to suit different system needs: the SLAB allocator, which uses fine-grained slab lists and per-CPU queues for low-latency access; SLUB, a simplified and scalable version that merges slabs into per-CPU partial lists to reduce metadata overhead and improve concurrency on multi-core systems, and which became the default in kernel 2.6.23; and SLOB, a minimalistic option for resource-constrained environments that treats memory as a simple list of blocks without dedicated slab structures. However, SLOB was removed in kernel 6.4 (June 2023) and SLAB in kernel 6.8 (March 2024), leaving SLUB as the sole general-purpose slab allocator. Key APIs include kmem_cache_create() for initializing caches, kmem_cache_alloc() and kmem_cache_free() for object handling, and kmem_cache_destroy() for cleanup, all of which support flags for behaviors such as hardware cache alignment, DMA-capable memory, or debugging. These developments enhance kernel performance by cutting allocation times, often to near-constant O(1) complexity, while accommodating concerns such as CPU hotplug and NUMA awareness on modern hardware.
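
A minimal sketch of how a kernel subsystem might use this Linux API (the my_object type, cache name, and surrounding functions are illustrative, not taken from any particular driver):

#include <linux/slab.h>
#include <linux/spinlock.h>

struct my_object {                     /* illustrative object type */
    int id;
    spinlock_t lock;
};

static struct kmem_cache *my_cache;

static int my_subsystem_init(void)
{
    /* One cache per object type; the final NULL means no constructor. */
    my_cache = kmem_cache_create("my_object_cache",
                                 sizeof(struct my_object), 0, 0, NULL);
    return my_cache ? 0 : -ENOMEM;
}

static void my_subsystem_work(void)
{
    /* Served from an existing slab whenever one has a free slot. */
    struct my_object *obj = kmem_cache_alloc(my_cache, GFP_KERNEL);

    if (!obj)
        return;
    spin_lock_init(&obj->lock);
    obj->id = 42;
    /* ... use the object ... */
    kmem_cache_free(my_cache, obj);    /* slot returns to its slab */
}

static void my_subsystem_exit(void)
{
    kmem_cache_destroy(my_cache);      /* all objects must be freed first */
}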

Overview and History

Definition and Purpose

Slab allocation is a memory management technique used in operating system kernels to efficiently handle the allocation and deallocation of fixed-size kernel objects. It operates by pre-allocating contiguous blocks of memory known as slabs, which are subdivided into fixed-size chunks tailored to specific object types, thereby enabling the reuse of these objects without the overhead of repeated memory searches or reinitialization. The primary purpose of slab allocation is to minimize the time and space costs associated with frequent object creation and destruction, particularly for complex structures that require initialization of embedded components such as locks or reference counts. By implementing object caching, it retains the state of allocated objects between uses, allowing allocation requests to be serviced rapidly from a pre-constructed pool rather than invoking costly dynamic memory operations each time. This approach significantly enhances performance; for instance, the allocation time for stream head objects in SunOS 5.4 was reduced from 33 µs to 5.7 µs on a SPARCstation 2.

In its high-level workflow, slab allocation maintains dedicated caches for each object type. When a request for an object arrives, it is served directly from the corresponding cache if a free object is available; otherwise, a new slab is allocated from the kernel's page allocator, populated with the required objects, and added to the cache. Deallocation simply returns the object to the cache without destruction, preserving its initialized state for future reuse, while reclamation mechanisms destroy objects and free slabs only under memory pressure. This method is particularly suited to kernel environments with recurring allocations of similar objects, such as inodes for file system management or task structures for process handling, where grouping allocations by type optimizes both spatial locality and initialization efficiency.
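
The effect of object caching can be illustrated with a deliberately simplified user-space sketch (all names here are invented for illustration): the constructor runs only when the pool grows, so recycled objects skip reinitialization entirely:

#include <stdlib.h>
#include <string.h>

#define OBJS_PER_SLAB 16

struct obj {
    struct obj *next_free;    /* freelist link, meaningful only while free */
    char payload[120];
};

static struct obj *free_list;

/* "Constructor": runs once per object when a new slab is added,
 * not on every allocation. */
static void ctor(struct obj *o)
{
    memset(o->payload, 0, sizeof o->payload);
}

static void grow(void)
{
    struct obj *slab = malloc(OBJS_PER_SLAB * sizeof *slab); /* one "slab" */

    if (!slab)
        return;
    for (int i = 0; i < OBJS_PER_SLAB; i++) {
        ctor(&slab[i]);
        slab[i].next_free = free_list;   /* push slot onto the freelist */
        free_list = &slab[i];
    }
}

static struct obj *obj_alloc(void)
{
    if (!free_list)
        grow();                  /* only slab creation pays the init cost */
    if (!free_list)
        return NULL;
    struct obj *o = free_list;
    free_list = o->next_free;
    return o;                    /* already constructed, ready to use */
}

static void obj_free(struct obj *o)
{
    o->next_free = free_list;    /* no destructor, no re-zeroing */
    free_list = o;
}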

Historical Development

Slab allocation was introduced by Jeff Bonwick at Sun Microsystems in 1994 as part of the kernel memory allocator for SunOS 5.4, which corresponds to Solaris 2.4. This design replaced the previous System V Release 4 (SVR4)-based allocator, aiming to improve efficiency for object management by using object-caching primitives that minimize initialization overhead and fragmentation. The allocator addressed key limitations of buddy allocators, which were common in Unix kernels of the era, such as external fragmentation and the high cost of object construction and destruction for frequently allocated kernel structures. Its influence spread to other systems in the late 1990s; for instance, FreeBSD incorporated a zone-based allocator in version 3.0 (released in 1998), which was later enhanced to provide slab-like functionality in version 5.0 (2003). In the Linux kernel, the SLAB implementation, directly inspired by Bonwick's work, was integrated starting with Linux 2.2 in 1999, enhancing performance over the prior K&R-style allocator. Subsequent evolutions in Linux included the SLOB variant, a lightweight implementation suited for embedded systems with constrained memory, introduced in 2005. To address scalability issues on multi-processor systems, the SLUB allocator was developed as a simpler, unqueued alternative and became the default in Linux 2.6.23 in October 2007, offering better performance by reducing locking overhead through per-CPU caching. SLUB also incorporated support for higher-order page allocations to optimize memory usage in large-scale environments. As of 2025, SLUB remains the primary slab allocator in the Linux kernel, with the original SLAB variant fully removed in kernel version 6.8 (early 2024) and SLOB removed in kernel version 6.4 (mid-2023). Ongoing optimizations focus on multi-core scalability and integration with modern hardware features, but no fundamental overhauls have occurred since SLUB's adoption.

Motivations and Problems Addressed

Memory Fragmentation in Traditional Allocators

Traditional kernel memory allocators, such as the buddy system commonly used in operating systems, suffer from external fragmentation, where free memory becomes scattered into small, non-contiguous blocks over time due to repeated allocations and deallocations. This scattering prevents the satisfaction of requests for large contiguous regions, even when the total free memory is sufficient, as the buddy system merges only adjacent blocks of equal size (powers of two) and fails to coalesce disparate fragments efficiently under heavy workloads. In kernel environments, this issue is exacerbated by long-lived allocations for structures like page tables or I/O buffers, leading to allocation failures and the need for costly memory compaction or reclamation.

Internal fragmentation in these allocators arises from the allocation of fixed-size blocks that exceed the requested size, resulting in wasted space within each allocated unit. Power-of-two block sizes mean that a request slightly larger than half a block size receives the next larger power-of-two block, wasting up to 50% of the space in the worst case, with an expected waste of around 25-28% across allocations. For small objects, such as descriptors or inodes typically under 1 KB, traditional allocators often rounded up to full page sizes (e.g., 4 KB), amplifying internal waste to as much as 60% per allocation and contributing to overall inefficiency. In high-load kernel scenarios, like those in early Unix and BSD systems, these fragmentation types combined to waste 40-50% of available memory, as observed in benchmarks such as kenbus, where the traditional power-of-two allocator in SVr4 exhibited 46% fragmentation. Frequent small, fixed-size allocations for kernel data structures, such as task control blocks, intensified the problem by creating numerous partially used blocks, increasing allocation failure rates and overhead from frequent searches through free lists. Compared to user-space general-purpose allocators like malloc, which handle variable-sized requests with techniques like binning to mitigate fragmentation, kernel allocators face heightened challenges due to the predictable yet frequent demand for fixed-size objects in a long-running, non-garbage-collected environment, making fragmentation more detrimental to system stability and performance.
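
The internal-fragmentation figures above follow from simple arithmetic. The standalone sketch below rounds a few sample request sizes up to the next power of two, mirroring how a buddy or power-of-two allocator sizes its blocks (the request sizes are arbitrary examples):

#include <stdio.h>
#include <stddef.h>

/* Round a request up to the next power of two, as a buddy or
 * power-of-two allocator does when sizing blocks. */
static size_t next_pow2(size_t n)
{
    size_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

int main(void)
{
    size_t requests[] = { 33, 100, 520, 2049 };

    for (int i = 0; i < 4; i++) {
        size_t block = next_pow2(requests[i]);
        double waste = 100.0 * (double)(block - requests[i]) / (double)block;

        /* e.g. a 2049-byte request gets a 4096-byte block: ~50% wasted */
        printf("request %4zu -> block %4zu (%4.1f%% internal waste)\n",
               requests[i], block, waste);
    }
    return 0;
}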

Overhead of Object Initialization

In traditional memory allocators, such as those based on sequential-fit or buddy systems, the initialization of each newly allocated object incurs significant computational overhead. This typically involves zeroing out the allocated memory to ensure safety and prevent information leakage, initializing object-specific fields like locks, pointers, and counters, and linking the object into relevant kernel structures such as lists or hash tables. These steps can consume thousands of CPU cycles per object, often exceeding the cost of the underlying memory allocation itself, as measured in pre-slab implementations where object setup dominated performance profiles. Deallocation introduces comparable overhead through a reverse set of operations, including unlinking the object from data structures, resetting fields to a safe state, and, in buddy-based systems, attempting to coalesce freed blocks with adjacent buddies to combat fragmentation. This coalescing step requires searching for matching free blocks and merging them, which adds variable latency depending on the fragmentation state and can lead to spikes during allocation bursts when multiple objects are freed concurrently. While fragmentation represents a related issue of spatial inefficiency, the runtime costs of these initialization and deallocation routines primarily manifest as temporal delays in object lifecycle management.

A concrete example of this overhead arises in process creation, such as the fork() system call in Unix-like kernels, where structures like the task_struct in Linux must be allocated and fully initialized for each new task. This includes copying parent process state, setting up scheduling parameters, and initializing security contexts, which collectively bottleneck system performance under high load, as repeated init/deinit cycles amplify delays in multi-process environments. Kernel objects such as inodes for file system operations or semaphores for synchronization are often short-lived yet requested at high frequency, leading to cumulative overhead that degrades overall throughput in demanding workloads. Prior to the slab allocator, early kernel designs lacked dedicated object pools or caching mechanisms, relying instead on ad-hoc general-purpose allocators such as simple malloc/free implementations built on buddy or sequential-fit heuristics. These approaches exacerbated initialization costs in multi-user systems, where concurrent allocations of complex objects such as inodes or tasks resulted in unacceptable latency, prompting the development of more efficient strategies to mitigate such inefficiencies.

Core Concepts

Caches and Slabs

In slab allocation, a cache serves as the primary organizational unit for managing objects of a specific type, such as task_struct process descriptors or inodes. Each cache, often implemented as a kmem_cache in systems like Linux, oversees the lifecycle of multiple slabs dedicated to that object type, maintaining global statistics including total object usage, allocation counts, and configurable growth limits to control expansion. Caches provide a centralized interface for allocation and deallocation requests, ensuring that objects are pre-initialized and readily available to minimize overhead. This design allows for type-specific optimizations, where clients create caches via primitives like kmem_cache_create, specifying parameters such as object size, alignment requirements, and optional constructor functions for initialization.

A slab, in contrast, represents a contiguous block of memory pages, typically one or more 4 KB pages in common implementations, partitioned into fixed-size slots tailored to hold objects of a single cache's type. Each slab includes embedded metadata for tracking object availability, such as freelist pointers and usage counters, which enable efficient navigation without external data structures. For instance, a slab for 400-byte objects uses a single 4 KB page to store 10 such objects, leaving 96 bytes of slack, or roughly 2.3% internal fragmentation. Slabs are sourced dynamically from a backing memory allocator, such as the buddy system, when a cache requires additional capacity, ensuring they align with the system's page granularity for minimal waste.

The relationship between caches and slabs is hierarchical and list-based: a cache maintains separate queues of slabs categorized by their state, full (no free objects), partial (some free objects), and empty (all objects free), with allocation requests preferentially serviced from partial slabs to maximize reuse and reduce fragmentation. This organization allows caches to grow or shrink by adding or removing slabs as demand fluctuates, while empty slabs can be returned to the backing allocator for reclamation. Object states within slabs, such as allocated or free, are managed internally but contribute to the slab's overall categorization in the cache.
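
The cache/slab relationship can be summarized in a simplified C sketch (field names are illustrative rather than copied from any kernel source):

#include <stddef.h>

struct slab {
    struct slab *next;        /* link within one of the cache's lists */
    void        *freelist;    /* first free slot in this slab's page(s) */
    unsigned int inuse;       /* allocated objects, out of 'capacity' */
    unsigned int capacity;    /* e.g. 4096 / 400 = 10 objects per 4 KB page,
                                 leaving 96 bytes (~2.3%) of internal slack */
};

struct cache {
    const char  *name;        /* e.g. "inode_cache" */
    size_t       object_size; /* fixed size of every object in the cache */
    struct slab *full;        /* no free slots */
    struct slab *partial;     /* some free slots: preferred for allocation */
    struct slab *empty;       /* all slots free: candidates for reclamation */
};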

Object Allocation States

In the slab allocation mechanism, objects within a slab progress through distinct lifecycle states that facilitate efficient reuse and minimize initialization overhead. The primary states are unused (free slots available for allocation), allocated (currently in use by the kernel), and partially initialized (pre-configured with common default values at the slab level). Unused objects reside on a free list within their slab, ready for immediate assignment without full reinitialization, while allocated objects are actively referenced by kernel components. The partially initialized state applies to objects that have undergone batch initialization via a constructor during slab creation, setting shared attributes such as zeroed fields or initialized locks to enable rapid deployment.

Transitions between these states are designed for speed and simplicity. Upon allocation, an unused or partially initialized object is transitioned to the allocated state with minimal additional setup, such as linking it to the requesting subsystem, avoiding per-allocation constructors. Deallocation reverses this by marking the object as unused, returning it to the free list without invoking a full destructor, preserving its partially initialized form for future use. This approach contrasts with traditional allocators by retaining object state across allocation cycles, reducing the overhead of costly zeroing or custom setup operations.

The benefits of these states stem from the use of optional constructor (ctor) and destructor (dtor) hooks, which are invoked only during slab growth or shrinkage rather than on individual allocations. The ctor pre-initializes all objects in a new slab with common defaults, such as clearing sensitive fields, while the dtor performs cleanup only when a slab is fully emptied and returned to the page allocator, ensuring resources like locks are released. This selective invocation cuts initialization costs significantly; in the original SunOS implementation, it reduced stream head allocation time from 33 μs to 5.7 μs by caching initialized state. By maintaining partially initialized objects, the system avoids redundant zeroing for frequently allocated structures, enhancing throughput in high-demand scenarios.

Tracking these states occurs at the object slot level within the slab structure. For small objects, an embedded pointer (such as a freelist link) indicates the state, utilizing otherwise unused space to avoid overhead; larger objects may use a separate control structure to map slot indices to free or allocated status. This lightweight mechanism ensures constant-time state queries and updates, integrating seamlessly with the enclosing cache and slab descriptors.
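
In the Linux API, the constructor hook is the final argument to kmem_cache_create(); a hedged sketch (my_buffer and its fields are invented for illustration), with the convention that callers return objects to the cache in their constructed state:

#include <linux/slab.h>
#include <linux/spinlock.h>

struct my_buffer {                     /* illustrative object type */
    spinlock_t lock;
    unsigned long flags;
};

/* Runs once per object when a new slab is populated, not per allocation. */
static void my_buffer_ctor(void *addr)
{
    struct my_buffer *b = addr;

    spin_lock_init(&b->lock);
    b->flags = 0;
}

static struct kmem_cache *my_buffer_cache;

static int my_buffer_cache_init(void)
{
    my_buffer_cache = kmem_cache_create("my_buffer",
                                        sizeof(struct my_buffer),
                                        0, 0, my_buffer_ctor);
    return my_buffer_cache ? 0 : -ENOMEM;
}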

Implementation Details

Slab Creation and Management

Slab creation in the slab allocator is triggered when a cache requires additional objects and has no existing slabs with free slots. In this scenario, the allocator invokes a growth routine, such as kmem_cache_grow() in SLAB-style implementations, to request pages from the underlying page allocator (e.g., the buddy system). These pages are then divided into fixed-size objects matching the cache's specifications, with metadata structures initialized to track object locations and states. If a constructor was provided during cache setup, it is executed for each object to perform one-time initialization, ensuring objects are in a ready state without per-allocation overhead.

Management of slabs within a cache involves maintaining lists that categorize them by utilization: full, partial, and empty. Policies enforce limits on the total number of slabs per cache to avoid unbounded memory consumption, often by capping the number of objects or slabs based on system constraints. Shrinking occurs under memory pressure, where empty slabs are identified and their pages returned to the page allocator via functions like kmem_cache_shrink(), reclaiming contiguous blocks. Growth strategies typically begin with a minimum number of slabs upon creation and expand incrementally on demand, such as when the cache's partial slabs are depleted, to balance responsiveness and resource use. Post-creation, objects within new slabs start in a free, pre-initialized state, ready for allocation.

Error handling during slab creation addresses failures in page allocation, often due to out-of-memory (OOM) conditions. If the page allocator cannot fulfill the request, the operation fails gracefully, with the kernel logging warnings via mechanisms like printk() to aid debugging of OOM kills or resource exhaustion. In such cases, higher-level allocators like kmalloc() may fall back to alternative strategies, such as using larger general-purpose pools, to preserve system stability. For example, in the Linux kernel, the kmem_cache_create() function establishes cache parameters including size, alignment, and constructor, while object allocation routines first attempt to retrieve objects from partial slabs via get_partial() and, if unsuccessful, trigger new_slab() to create a new one.
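
A condensed sketch of this growth path, in the same pseudocode style as the listings in the next section (function and field names are illustrative; real implementations add locking, NUMA placement, and finer-grained error handling):

cache_grow(struct cache *cache, int flags) {
    pages = alloc_pages_from_buddy(cache->slab_order, flags);
    if (!pages)
        return NULL;                    /* OOM: caller falls back or fails */
    slab = init_slab_metadata(pages);
    for (each object slot in pages) {
        if (cache->ctor)
            cache->ctor(slot);          /* one-time batch initialization */
        push_to_slab_freelist(slab, slot);
    }
    add_to_partial_list(cache, slab);   /* now able to service requests */
    return slab;
}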

Allocation and Deallocation Process

In slab allocation, the process of requesting an individual object from a cache begins by searching the free lists of that cache's partial slabs. In implementations with per-CPU caches, such as the Linux kernel, the local per-CPU cache is checked first for an available object to minimize contention in multiprocessor environments. Should partial slabs also lack free objects, a new slab is created and populated with initialized objects, as detailed in the slab creation procedures above. The selected object is then marked as allocated, and its pointer is returned to the requester.

Deallocation reverses this flow by first validating the object's address and cache affiliation to prevent errors. If a destructor is registered for the object type, it is invoked not on each free but when the object's slab is eventually torn down, preserving the object's constructed state for reuse. The freed object is added to the appropriate free list, typically the per-CPU freelist for quick reuse or the owning partial slab's list otherwise. If the slab becomes entirely empty after deallocation, it is marked for potential reclamation under memory pressure, though it may remain in the cache's free slab list for future allocations.

To manage concurrency, slab allocators employ cache-wide locks, such as spinlocks, when accessing shared free lists in partial or full slabs. These locks are minimized in per-CPU designs, where local caches allow lock-free operations for most allocations and deallocations on the same CPU. The core algorithms can be outlined in pseudocode as follows:

Allocation Pseudocode:
kmem_cache_alloc(struct kmem_cache *cache, int flags) {
    /* Fast path: lock-free pop from this CPU's local freelist. */
    if ((free_object = pop_from_percpu_freelist(cache)))
        return free_object;
    /* Medium path: take an object from a partial slab (may lock). */
    if ((free_object = pop_from_partial_slab_freelist(cache)))
        return free_object;
    /* Slow path: grow the cache with a new slab, then allocate. */
    create_new_slab(cache);
    free_object = pop_from_new_slab_freelist(cache);
    return free_object;
}
Deallocation Pseudocode:
kmem_cache_free(struct kmem_cache *cache, void *object) {
    validate_object(cache, object);      /* catch bad pointers, double frees */
    obj_index = compute_index_in_slab(object);
    push_to_freelist(cache, obj_index);  /* to per-CPU or partial slab list */
    if (slab_is_empty(cache, slab_of(object))) {
        /* Slab teardown (not the per-free path) is where any dtor runs. */
        mark_slab_for_reclamation(slab_of(object));
    }
}
These processes achieve average O(1) time complexity due to direct access to freelists and the avoidance of linear searches, enabling efficient handling of frequent small-object allocations in kernel environments.

Advanced Techniques

Slab Coloring

Slab coloring is a technique used in the slab allocator to optimize processor cache performance by distributing object addresses evenly across cache lines, thereby reducing cache-line conflicts among concurrently accessed objects. In this approach, each slab is assigned a unique offset, or "color," from its page-aligned base address, which shifts the starting positions of objects within the slab. This prevents multiple unrelated objects from mapping to the same cache line, which could otherwise lead to thrashing and increased miss rates in multi-threaded environments like operating system kernels. The method addresses the limitations of traditional power-of-two allocators, which often align objects poorly with cache geometries, leading to suboptimal utilization.

Implementation involves calculating the color during slab creation, where the offset is chosen from a sequence of values that fit within the unused space of the slab page. For instance, with 200-byte objects allocated from 4 KB pages on a system with 8-byte alignment granularity, colors range from 0 to 64 bytes in 8-byte increments, ensuring that successive slabs use different offsets to spread objects across cache lines. The maximum number of colors is limited by the slab's free space after accounting for object sizes and metadata, and the allocator cycles through these colors for new slabs in a cache. This padding introduces minimal overhead, since colors exploit the natural slack in page-sized slabs.

The benefits of slab coloring include significant improvements in cache hit rates and overall system throughput, particularly on multiprocessor systems. On the SPARCcenter 2000, it reduced primary cache miss rates by 13% and improved bus balance, reducing imbalance from 43% to 17%, while benchmarks showed 5% fewer primary cache misses during parallel builds. These gains stem from better cache line utilization and reduced concentration of memory traffic, making it especially effective for workloads with high contention. Slab coloring was introduced in the original SunOS 5.4 slab allocator design, tailored for SPARC processors to enhance multiprocessor scalability.
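
The color cycling can be sketched in a few lines (illustrative only: next_color, max_color, and align are hypothetical fields extending the cache structure sketched earlier, with max_color derived from the leftover space in a slab page):

/* Assign each new slab a different starting offset so that objects in
 * successive slabs map to different cache lines. */
unsigned int next_color(struct cache *cache)
{
    unsigned int color = cache->next_color;

    /* Advance one alignment step; wrap when the page slack is used up. */
    cache->next_color += cache->align;      /* e.g. 8-byte steps */
    if (cache->next_color > cache->max_color)
        cache->next_color = 0;              /* e.g. wrap past 64 bytes */
    return color;   /* objects in the new slab start at page_base + color */
}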

Per-CPU Caches

In multi-processor systems, per-CPU caches in slab allocators address issues arising from global lock contention during allocation and deallocation. Each CPU maintains a small, local cache of free objects, which is replenished from a global depot only when depleted. This design ensures that most operations occur locally without acquiring shared locks, thereby minimizing inter-CPU communication and cache line bouncing.

The mechanism employs a layered approach where the per-CPU cache acts as a fast-access tier, often implemented as a magazine, a fixed-size array or stack of object pointers per CPU, stocked with pre-allocated items from larger slabs. For allocation, the requesting CPU first attempts to pop an object from its local magazine; if it is empty, the CPU swaps with a secondary local magazine or fetches a full magazine from the global depot, using atomic operations such as compare-and-swap (cmpxchg) to avoid locks. Deallocation pushes objects back to the local magazine, with overflow items returned to the depot only on imbalance, such as when a CPU's magazine exceeds its capacity or the allocator detects uneven distribution across processors. Object migration between CPUs is rare and triggered solely by such imbalances, preserving locality.

This approach yields significant performance benefits, reducing the time spent holding global locks from microseconds to nanoseconds per operation and enabling near-linear scalability with the number of cores, as demonstrated in benchmarks showing doubled throughput on multi-CPU systems under high load. By localizing access, it also lowers remote memory accesses and cache miss rates, with empirical results indicating miss rates bounded by the inverse of the magazine size.

Tuning of per-CPU caches balances efficiency against memory overhead, typically limiting the cache to one or two partial slabs or a small number of objects (e.g., 6-120 depending on object size) per CPU to prevent excessive remote accesses or wasted space on idle processors. Dynamic adjustment of cache sizes, based on contention metrics, further optimizes usage without manual intervention. In the SLUB allocator, per-CPU caches are realized through partial lists embedded in the kmem_cache_cpu structure, where each CPU holds a freelist of objects from an active slab for lockless fast-path operations on supported architectures. When local partial lists overflow or become depleted, objects are migrated via atomic exchanges, integrating with the broader allocation process. As of October 2025, the Linux kernel (version 6.18) introduced sheaves, an opt-in per-CPU array-based caching layer for the SLUB allocator that replaces traditional CPU partial slabs. This enhancement aims to reduce overhead in per-CPU operations and improve overall scalability.
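
A reduced sketch of the magazine fast path (NR_CPUS, MAG_SIZE, and the depot helpers are illustrative; preemption control and the second-magazine swap described in the magazine paper are omitted):

#define NR_CPUS  64            /* illustrative CPU count */
#define MAG_SIZE 32            /* objects per magazine */

struct magazine {
    int   count;               /* objects currently stocked */
    void *objs[MAG_SIZE];      /* pointers to ready-to-use objects */
};

/* Slow-path helpers (depot interaction) are elided. */
void *depot_refill_and_alloc(struct magazine *m);
void  depot_return_full(struct magazine *m, void *obj);

/* One magazine per CPU, touched only by its owning CPU: no lock needed. */
static struct magazine percpu_mag[NR_CPUS];

void *mag_alloc(int cpu)
{
    struct magazine *m = &percpu_mag[cpu];

    if (m->count > 0)
        return m->objs[--m->count];    /* lock-free local fast path */
    return depot_refill_and_alloc(m);  /* swap in a full magazine */
}

void mag_free(int cpu, void *obj)
{
    struct magazine *m = &percpu_mag[cpu];

    if (m->count < MAG_SIZE) {
        m->objs[m->count++] = obj;     /* push back locally */
        return;
    }
    depot_return_full(m, obj);         /* hand a full magazine to the depot */
}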

Free List Management

In slab allocators, free list management involves tracking and linking unallocated objects within slabs to enable rapid reuse during allocation requests. Each slab maintains its own freelist, which serves as a linked list of available objects, ensuring that allocations and deallocations operate in constant time by simply adjusting pointers and reference counts. This approach preserves object initialization and reduces fragmentation by keeping related objects grouped and ready for immediate use.

For small objects, typically those smaller than half the slab size, the freelist pointer is embedded directly within the object itself, often at the end of the buffer, pointing to the next free object in the list. This minimizes metadata overhead by repurposing space in the free object for linkage, such as through a kmem_bufctl structure that holds the freelist pointer and related bookkeeping. In contrast, larger objects employ a separate control structure, such as an array of indices (kmem_bufctl_t in SLAB implementations), stored either on-slab (within the slab's initial space) or off-slab (in a dedicated cache) to track free object positions without intruding on object space. For instance, the original slab design keeps management data for small buffers on the slab page itself, while the Linux SLAB allocator uses an on-slab kmem_bufctl_t array for objects under 512 bytes, initializing it as a pseudo-linked list of sequential indices terminated by a sentinel value such as BUFCTL_END.

Common strategies for freelist organization include LIFO (last-in, first-out), where objects are added and removed at the head of the list for simplicity and cache locality, though FIFO (first-in, first-out) variants exist in some adaptations. In the SLUB allocator, freelists operate at the page level, with metadata embedded in the struct page (using fields like freelist for the head pointer, inuse for the allocated count, and offset for pointer placement) and pointers chained through the free objects themselves, enabling lockless per-CPU access. Maintenance during operations is straightforward: on deallocation, the object is pushed to the front of the freelist by pointing it at the previous head, and the in-use count is decremented; on allocation, the head is popped, its embedded link cleared, and the count incremented, with the slab moved between full, partial, and empty lists as needed. The original design exemplifies this with each slab holding a freelist head in its kmem_slab structure, while Linux SLAB updates the slab_t->free index to traverse the kmem_bufctl_t array.

Optimizations focus on reducing traversal costs and lock contention, such as batching multiple allocations or deallocations to process freelists in bulk before updating cache-wide structures, and efficiently handling transitions between full, partial, and empty slab states via sorted lists or reference counts. For example, SLUB batches freelist transfers to per-CPU structures to avoid frequent locking of the struct page, while the original allocator uses simple pointer swaps for constant-time pushes and pops, reclaiming fully free slabs only under memory pressure. These techniques ensure scalable performance, with SLAB's array-based indexing allowing O(1) free-object lookup regardless of slab fullness.
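
The embedded-pointer trick for small objects can be shown in a few lines of standalone C: each free slot's own first bytes hold the link to the next free slot, so the freelist needs no storage beyond a single head pointer (a minimal sketch, not any kernel's actual code):

/* Each free slot stores its freelist link in its own first bytes, so
 * tracking free objects costs no extra memory for small objects. */
static void *freelist_head;

static void freelist_push(void *slot)
{
    *(void **)slot = freelist_head;      /* write next-link into the slot */
    freelist_head = slot;                /* LIFO: freed slot becomes head */
}

static void *freelist_pop(void)
{
    void *slot = freelist_head;

    if (slot)
        freelist_head = *(void **)slot;  /* follow the embedded link */
    return slot;                         /* caller may overwrite the link */
}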

Variations

Original Slab Allocator

The original slab allocator, introduced by Jeff Bonwick in SunOS 5.4 (Solaris 2.4), represents a foundational approach to kernel memory management, organizing fixed-size objects into reusable slabs to minimize fragmentation and initialization overhead. This design caches objects such as inodes and stream heads in dedicated slabs, allowing rapid allocation and deallocation without repeated calls to lower-level allocators. Core features include explicit constructors and destructors to initialize and clean up object state, ensuring that objects are in a valid condition upon allocation and release their resources properly when their slabs are reclaimed. The later magazine extension additionally grouped multiple objects into fixed-size "magazines" for efficient bulk transfer, forming the basis for per-CPU caching in subsequent implementations. Slab coloring, tailored for Sun hardware such as SPARC systems, offsets object placements within slabs to optimize cache line utilization and balance bus traffic, reducing contention on multiprocessor buses.

Slabs in this allocator are classified into three types based on their occupancy: full slabs contain only allocated objects, partial slabs hold a mix of allocated and free objects managed via per-slab freelists, and empty slabs hold no allocations and await reuse or reclamation. Magazines facilitate transfers of full and empty object batches between the kernel's central depot and per-CPU layers, allowing bulk operations to amortize locking costs and improve throughput for high-frequency allocations. The allocator integrates with the general-purpose kmem_alloc framework, which handles variable-sized allocations by dedicating approximately 30 slab caches to common object sizes ranging from 8 bytes to 9 KB. It supports variable slab sizes determined by object alignment requirements, limiting internal fragmentation to at most 12.5%.

Performance optimizations in the original design target uniprocessor (UP) and small multiprocessor architectures, employing per-cache locks to serialize access within each object cache while allowing concurrent operations across different caches, which reduced average allocation and free times from 33 µs to 5.7 µs in benchmarks. However, the reliance on per-cache locking created a bottleneck in symmetric multiprocessor (SMP) environments, limiting scalability until subsequent modifications. This design laid the groundwork for ports to other systems and enhancements such as the refined magazine mechanism for better multiprocessor support.

Linux SLAB Allocator

The Linux SLAB allocator was introduced in Linux kernel version 2.2 in 1999, porting and adapting Jeff Bonwick's original slab design from Solaris to the x86 architecture and Linux's memory management framework; the initial implementation was written by Mark Hemment, with substantial later rework by Manfred Spraul. This implementation replaced the earlier kmalloc allocator, providing a more efficient object-caching mechanism tailored to kernel needs, such as rapid allocation of frequently used structures like inodes and task structs. While drawing on the Solaris primitives for slab caches and freelists, the Linux version incorporated platform-specific optimizations to handle the buddy allocator's page-based allocations and alignment constraints.

A key enhancement in the SLAB allocator was the addition of debugging features to detect common errors. Redzoning places marker bytes at the boundaries of allocated objects to detect buffer overflows, while poisoning fills freed or uninitialized objects with a distinctive pattern (such as 0x5a) to identify invalid accesses. These capabilities are controlled through kmem_cache creation flags, such as SLAB_RED_ZONE for enabling redzoning and SLAB_POISON for poisoning, allowing developers to activate them selectively via kernel configuration options like CONFIG_DEBUG_SLAB. Such features proved invaluable for robustness in production kernels, though they incur a performance overhead due to additional checks and metadata.

The allocator's structure emphasizes per-CPU efficiency and scalability, particularly in multi-processor environments. Each slab cache maintains per-CPU arrays for local object freelists to minimize contention, supplemented by alien[] arrays that stage remote frees from other NUMA nodes, reducing cross-node traffic. For memory accounting in containerized environments, SLAB integrates objcg (object cgroup) support, enabling per-cgroup tracking of slab allocations through the memory control group (memcg) subsystem; this charges individual objects to specific cgroups upon allocation, facilitating fine-grained resource limits. Unlike the magazine-based tiered caching of the later Solaris design, SLAB uses simpler array-based freelists for free-object management, streamlining the implementation while relying on the page allocator for objects too large for slab caches.

By the 2010s, the original SLAB allocator had been largely superseded by the simpler SLUB variant, which became the default in Linux 2.6.23 (2007) due to reduced complexity and better performance on modern hardware. SLAB remained configurable via the CONFIG_SLAB build option until its removal in kernel 6.8, and it continued to see use in some distributions and debugging setups where its maturity was preferred.
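
Red-zoning and poisoning are requested per cache at creation time; a brief sketch (the cache name and object size are illustrative, and the flags only take effect in kernels built with slab debugging enabled):

#include <linux/slab.h>

static struct kmem_cache *dbg_cache;

static int dbg_cache_init(void)
{
    /* SLAB_RED_ZONE puts guard bytes around each object to catch
     * overflows; SLAB_POISON fills free objects with a known pattern
     * so stale reads and use-after-free accesses stand out. */
    dbg_cache = kmem_cache_create("dbg_object", 128, 0,
                                  SLAB_RED_ZONE | SLAB_POISON, NULL);
    return dbg_cache ? 0 : -ENOMEM;
}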

SLUB Allocator

The SLUB allocator, developed by Christoph Lameter in 2007, was introduced as a streamlined replacement for the SLAB allocator to address its growing complexity and scalability issues in multi-processor environments. It was merged into the Linux kernel with version 2.6.22 in July 2007 and became the default allocator from kernel 2.6.23, owing to its simpler codebase and reduced overhead. Unlike its predecessor, which relied on extensive queuing mechanisms, SLUB prioritizes efficiency on modern hardware by minimizing locks and metadata structures, making it particularly suited for high-throughput workloads on multi-core systems.

At its core, SLUB manages memory through per-page freelists, where free objects within a slab page are linked directly using pointers stored in the objects themselves, avoiding additional per-object overhead. All slab metadata is embedded in the kernel's struct page, enabling seamless integration with the page allocator and eliminating separate slab descriptors. To achieve lock-free operation on the fast path, SLUB utilizes per-CPU caches of partial slabs, pages that are neither full nor empty, allowing allocations and deallocations to proceed without global synchronization in most cases. This design contrasts with SLAB's per-CPU array caches by forgoing batching optimizations for small objects in favor of a uniform freelist approach, which simplifies the code while maintaining low-latency access.

SLUB includes several targeted optimizations and features for robustness and debugging. For small objects, the absence of SLAB-style batching reduces complexity, though it may slightly increase cache pressure in some scenarios. Debugging capabilities, such as red-zoning enabled through the slub_debug facility, add padding bytes around allocated objects to detect overflows and memory corruption during development or troubleshooting. Additionally, SLUB supports higher-order pages (orders greater than zero) for caches requiring larger contiguous allocations, improving efficiency in memory-intensive applications.

In terms of performance, initial benchmarks showed SLUB delivering 5-10% faster allocation speeds compared to SLAB, attributed to its leaner code paths and reduced locking. It also benefits non-uniform memory access (NUMA) systems by facilitating page migration: since metadata resides solely in the struct page, entire slabs can be relocated between nodes without custom handling, improving locality and reducing remote access latencies. As of 2025, SLUB continues as the sole general-purpose slab allocator in the mainline Linux kernel, following the deprecation and removal of SLAB in version 6.8, with ongoing refinements for architectures such as arm64. Linux 6.18 (2025) introduced "sheaves," an opt-in per-CPU caching mechanism for SLUB that further reduces lock contention on high-core-count systems.

SLOB Allocator

The SLOB (Simple List of Blocks) allocator was introduced by Matt Mackall in 2006 as a lightweight alternative to the SLAB and SLUB allocators, specifically targeting resource-constrained embedded environments with less than 2 MB of memory. Its design centers on a single global freelist that encompasses all free blocks, organized by size categories using simple singly-linked lists, which eliminates per-cache overhead and relies on the underlying page allocator for expansion. Allocation proceeds via a first-fit search through the appropriate size-based list, while deallocation merges freed blocks back into the list, all managed within a unified arena without dedicated slab structures. This minimalist approach draws from traditional K&R-style heap management, remaining compatible with the kmalloc and kmem_cache interfaces but with granular 8-byte alignment on architectures like x86. While offering constant O(1) space overhead and a compact codebase of around 600 lines, SLOB trades off higher internal fragmentation from its first-fit strategy and lacks optimizations such as slab coloring or per-CPU caches, making it suitable mainly for uniprocessor (UP) embedded systems where a minimal footprint outweighs allocation speed.

In the Linux kernel, SLOB is enabled through the CONFIG_SLOB configuration option, typically selected in embedded builds instead of CONFIG_SLUB or CONFIG_SLAB, allowing it to serve as the kmalloc backend from a single contiguous arena. However, its global locking and potential for linear-time traversals make it unsuitable for high-performance or multi-core workloads. By the early 2020s, SLOB had been deprecated (in Linux 6.2) in favor of SLUB and was removed entirely in version 6.4 due to maintenance burdens and the prevalence of more capable alternatives.
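
First-fit allocation over a single freelist can be sketched in a few lines (simplified: real SLOB splits oversized blocks, keeps size information in block headers, and merges neighbors on free):

#include <stddef.h>

struct block {
    size_t        size;        /* usable bytes in this free block */
    struct block *next;        /* singly-linked global freelist */
};

static struct block *slob_free_list;

/* First fit: take the first free block large enough for the request. */
void *slob_alloc(size_t size)
{
    struct block **prev = &slob_free_list;

    for (struct block *b = slob_free_list; b; prev = &b->next, b = b->next) {
        if (b->size >= size) {
            *prev = b->next;   /* unlink; a real allocator would also split */
            return b;
        }
    }
    return NULL;               /* caller would grow the arena by a page */
}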

Adoption in Operating Systems

Solaris

The slab allocator forms the core of the Solaris kernel memory (kmem) allocator, introduced in Solaris 2.4 in 1994 to manage all kernel memory allocations except page-level structures, replacing the previous buddy allocator inherited from SVR4 Unix. This implementation, designed by Jeff Bonwick, uses object caching to minimize initialization overhead for frequently allocated kernel objects by maintaining pre-initialized slabs in caches tailored to specific object sizes and types.

Subsequent extensions integrated the slab allocator with advanced features, including support for ZFS, introduced in Solaris 10 in 2005, where kmem caches allocate memory for ZFS components such as Adaptive Replacement Cache (ARC) buffers. The allocator also incorporates auditing capabilities, such as redzone and deadzone checks to detect overwrites and use-after-free errors, along with tools like the Modular Debugger's ::findleaks command, which traces unreferenced allocations in crash dumps. Additionally, it supports Solaris Zones by providing isolated kmem caches per zone to enhance resource separation in consolidated environments.

As of Solaris 11.4 (released in 2018) and its updates through 2025, the kmem allocator includes DTrace probes under the kmem provider (e.g., kmem:::alloc and kmem:::free) for real-time monitoring of allocation patterns and performance. Common usage includes allocating door descriptors for inter-process communication and ARC buffers for ZFS caching, with the slabstat(1M) tool providing statistics on cache utilization and fragmentation. The allocator scales efficiently to large systems, leveraging per-CPU magazines for low-overhead access in multi-socket environments with thousands of cores.

Linux

Slab allocation was first integrated into the Linux kernel with version 2.2 in 1999, introducing the original SLAB allocator inspired by the Solaris implementation. Over time, it evolved with the addition of the SLOB allocator for embedded systems and the SLUB allocator, which became the default in version 2.6.23 in 2007 due to its improved performance and simpler design. SLOB was deprecated in kernel 6.2 and removed in kernel 6.4 (2023), while SLAB was deprecated in 6.5 and removed in 6.8 (2024), leaving SLUB as the primary general-purpose allocator as of 2025. The slab allocator underpins key memory management interfaces, most notably kmalloc() for small allocations, while vmalloc() handles larger virtually contiguous mappings outside the slab layer.

Configuration of the slab allocator, particularly SLUB, can be tuned via kernel boot parameters to optimize for specific workloads. For instance, the slub_max_order parameter caps the page order used for slab pages; higher orders can reduce metadata overhead and fragmentation on high-memory systems but increase the risk of allocation failures on low-RAM machines. Runtime statistics are accessible through /proc/slabinfo, which reports per-cache details like object counts, active usage, and memory consumption, for analyzing allocation patterns.

In practice, the slab allocator manages caches for frequently allocated structures, such as task_struct for process descriptors, dentry for directory entries in the virtual filesystem layer, and sk_buff for network packet buffers. Integration with control groups (cgroups) extends this to containerized environments, where the memory controller enforces limits on kernel memory (kmem) usage, including slab allocations, via interfaces like memory.kmem.limit_in_bytes to prevent resource exhaustion in isolated workloads. Monitoring tools like slabtop provide real-time views of the top slab caches sorted by metrics such as memory usage or object count, aiding in identifying leaks or hotspots without halting the system. For debugging, the CONFIG_DEBUG_SLAB configuration option (for SLAB) and SLUB's debug support enable features like poisoning, red-zoning, and tracepoints to detect corruption or double frees, with runtime control via boot parameters such as slab_debug=FZ.

As of 2025, SLUB remains the dominant allocator across distributions, benefiting from ongoing optimizations for scalability and security, including slab-backed allocation support for Rust-based kernel components in the Rust-for-Linux ecosystem, without disrupting existing C code.
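
Slab statistics can also be inspected programmatically rather than through slabtop; a small user-space sketch that prints the header and first few cache rows of /proc/slabinfo (reading the file usually requires root privileges):

#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/proc/slabinfo", "r");
    char line[512];
    int n = 0;

    if (!f) {
        perror("fopen /proc/slabinfo");  /* usually needs root */
        return 1;
    }
    /* Each row: cache name, active/total objects, object size, ... */
    while (fgets(line, sizeof line, f) && n++ < 10)
        fputs(line, stdout);
    fclose(f);
    return 0;
}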

Other Systems

FreeBSD implements the Universal Memory Allocator (UMA), a slab-inspired mechanism whose roots lie in the zone allocator introduced in 1998 and which was refined in FreeBSD 5.0 (2003) to function explicitly as a slab allocator, using zones for object collections and kegs as backing caches for fixed-size items. UMA manages dynamically sized collections of identical objects, serving as a backend for functions like malloc(9) and the allocation of virtual memory (VM) objects, thereby reducing initialization overhead for frequently used structures.

NetBSD and OpenBSD employ pool allocators that operate in a slab-like manner, caching pre-initialized buffers of fixed sizes to accelerate allocation and deallocation of structures such as processes, sockets, and file descriptors. In NetBSD, the pool(9) interface provides a resource manager for fixed-size buffers, maintaining per-pool caches to minimize fragmentation and support efficient reuse, akin to slab caches. OpenBSD's pool(9) similarly uses slab-style caching with dedicated pools for kernel objects, emphasizing hardening features such as randomized allocation to mitigate exploits while handling structures like mbufs and UVM objects.

The Windows NT kernel incorporates partial analogs to slab allocation through its executive pool manager, which uses lookaside lists for fast allocation of small, frequent objects via functions like ExAllocatePool, though it lacks a pure slab implementation and relies on paged and non-paged pools for general memory. Research into NT kernel variants has proposed slab-like enhancements to the pool system to improve object caching and reduce overhead for driver and subsystem allocations.

In embedded environments, Android leverages the SLUB allocator from the Linux kernel for efficient kernel memory management, particularly in handling device drivers and system services on resource-constrained mobile hardware. Real-time operating systems like Zephyr include minimal slab-like facilities, where memory slabs serve as kernel objects enabling dynamic allocation of fixed-size blocks from pre-designated regions, supporting low-latency operations in IoT and embedded applications without the full complexity of general-purpose slab allocators.

Advantages and Limitations

Benefits

Slab allocation significantly reduces memory fragmentation compared to traditional buddy systems by grouping objects of identical sizes into fixed slabs, limiting internal fragmentation to a maximum of 12.5% per slab and achieving overall fragmentation of approximately 14% under heavy workloads, in contrast to the 46% observed in comparable allocators such as SVR4's. This efficiency stems from pre-allocating contiguous blocks tailored to specific object sizes, minimizing both internal waste within slabs and external fragmentation across the heap.

Allocation and deallocation operations in slab allocators are notably faster, with allocation averaging 3.8 microseconds in early implementations, compared to 25.0 microseconds in prior SunOS versions and 9.4 microseconds in SVR4 systems; these benchmarks, derived from the original SunOS 5.4 evaluations, highlight the O(1) behavior enabled by per-cache freelists. Object caching further accelerates repeated allocations by avoiding reinitialization, reducing times from 33 microseconds to 5.7 microseconds for complex structures like stream heads. The pre-initialization of objects in slabs lowers overhead by eliminating per-allocation setup costs, achieving savings of up to 83% in object-construction time in kernel workloads, and reducing execution time by around 5% in server benchmarks.

Extensions like per-CPU magazines enable scaling to systems with many cores by distributing locks and caches, delivering up to 16-fold throughput gains on 16-CPU systems and 50% improvements in benchmarks like LADDIS on multi-socket hardware. Slab allocation provides deterministic behavior with predictable latencies, making it suitable for real-time kernels where allocation times must remain bounded without searches or coalescing delays; enhancements that handle remote frees predictably extend this to multi-core systems. Empirical evaluations of the Linux SLUB variant, a streamlined slab implementation, demonstrate throughput improvements of 5-10% in benchmarks like kernbench on multi-core systems, outperforming earlier SLAB designs in CPU-intensive tests. Recent enhancements, such as the sheaves implementation in Linux 6.18 (as of October 2025), provide up to 30% gains in multi-threaded workloads on EPYC processors.

Drawbacks

Despite its design goal of minimizing fragmentation, slab allocation can still suffer from internal fragmentation when object sizes do not align perfectly with slab boundaries, leading to wasted space within slabs; the original implementation capped this at 12.5% as an empirical trade-off between space efficiency and allocation speed. Memory overhead arises from metadata storage for slab management, such as descriptors and pointers, which can consume 5-10% additional space depending on object size; in Linux's SLUB variant, this contributes to higher overall usage compared to simpler allocators like SLOB, with examples showing 32200 kB versus 30800 kB in certain workloads. Per-CPU caches, intended to reduce contention, may hold unused objects, exacerbating idle memory retention across multiple processors.

The allocator's complexity stems from intricate data structures like cache lists (full, partial, free) and coloring mechanisms, increasing code size and maintenance effort; this also complicates debugging memory leaks, as cached objects obscure allocation patterns. Early designs faced lock contention in shared caches, and while later variants like SLUB mitigate this, they do so at the cost of added implementation layers. Scalability limitations appear in multi-node NUMA systems, where global cache reaping ignores node locality, potentially leading to inefficient cross-node allocations; per-CPU structures help but can be overwhelmed under extreme loads, prompting alternatives like percpu_alloc for very small objects.

Security concerns include vulnerability to use-after-free exploits, as cached and recycled objects may retain sensitive data if not properly zeroed or poisoned; while features like object poisoning detect overflows, the caching and reuse of objects can be leveraged by attackers for heap manipulation, bypassing the type-separation restrictions of SLAB/SLUB.

References

  1. [1]
    [PDF] The Slab Allocator: An Object-Caching Kernel Memory Allocator
    Abstract. This paper presents a comprehensive design over- view of the SunOS 5.4 kernel memory allocator. This allocator is based on a set of object-caching.
  2. [2]
    Slab Allocator - The Linux Kernel Archives
    The basic idea behind the slab allocator is to have caches of commonly used objects kept in an initialised state available for use by the kernel.
  3. [3]
    Memory Allocation Guide - The Linux Kernel documentation
    If you need to allocate many identical objects you can use the slab cache allocator. The cache should be set up with kmem_cache_create() or ...
  4. [4]
    [PDF] Proceedings of the 2001 USENIX Annual Technical Conference
Jeff Bonwick (bonwick@eng.sun.com) is a Senior. Staff Engineer at Sun Microsystems. He works primarily on core kernel services (allocators, lock primitives ...
  5. [5]
    The Slab Allocator: An Object-Caching Kernel - USENIX
    This paper presents a comprehensive design overview of the SunOS 5.4 kernel memory allocator. This allocator is based on a set of object-caching primi- tives.
  6. [6]
  7. [7]
    [PDF] Slab allocators in the Linux Kernel: SLAB, SLOB, SLUB
    Oct 3, 2014 · Slab allocators available. • SLOB: K&R allocator (1991-1999). • SLAB: Solaris type allocator (1999-2008). • SLUB: Unqueued allocator (2008-today).
  8. [8]
    The SLUB allocator - LWN.net
    The SLUB allocator, a drop-in replacement for the slab code. SLUB promises better performance and scalability by dropping most of the queues and related ...
  9. [9]
    What's next for the SLUB allocator - LWN.net
    May 20, 2024 · Meanwhile, Babka is not satisfied with removing just SLOB and SLAB; next on the target list is the special allocator used by the BPF subsystem.
  10. [10]
    Chapter 6 Physical Page Allocation - The Linux Kernel Archives
    This chapter describes how physical pages are managed and allocated in Linux. The principal algorithmm used is the Binary Buddy Allocator.
  11. [11]
    Linux Kernel vs. Memory Fragmentation (Part I) - High Scalability -
    Jun 8, 2021 · In this post, I'll introduce some common extensions to the buddy allocator that helps prevent memory fragmentation in the Linux 3.10 kernel.
  12. [12]
    [PDF] CMSC 420: Lecture 15 Memory Management
    Because it limits block sizes, internal fragmentation (the waste caused when an allocation request is mapped to a larger block size) becomes an issue. The ...
  13. [13]
    Lecture 27, Dynamic Storage Allocation - University of Iowa
    Since the binary buddy system is expected to waste 25 percent of the allocated memory to internal fragmentation, this suggests that the buddy system is the ...
  14. [14]
    [PDF] Allocating Memory - LWN.net
    The Linux kernel offers a richer set of memory allocation primitives, however. In this chapter, we look at other ways of using memory in device drivers and how ...
  15. [15]
    Kmalloc Internals: Exploring Linux Kernel Memory Allocation
The constructor/destructor can be passed by the creator of the cache, and ... If a destructor was specified, then we call it for each object in the slab.
  16. [16]
    [PDF] Magazines and Vmem: Extending the Slab Allocator to Many CPUs ...
    The slab allocator caches relatively small objects and relies on a more general−purpose backing store to provide slabs and satisfy large allocations. We ...
  17. [17]
    Linux SLUB Allocator Internals and Debugging, Part 1 of 4
    Dec 6, 2022 · Hence, the Linux kernel has 3 flavors of slab allocators namely, SLAB, SLUB and SLOB allocators. The SLUB allocator is the default and most ...
  18. [18]
    ./mm/slab.c - Verifysoft
    * The flags are * * %SLAB_POISON - Poison the slab with a known test pattern (a5a5a5a5) * to catch references to uninitialised memory. * * %SLAB_RED_ZONE ...
  19. [19]
    Short users guide for the slab allocator
    This has a higher likelihood of resulting in slab allocation errors in low memory situations or if there's high fragmentation of memory.
  20. [20]
    [PDF] The slab allocators of past, present, and future
    Sep 12, 2022 · ... The New Frontiers “ by Uresh Vahalia and “The Slab. Allocator: An Object-Caching Kernel Memory Allocator” by Jeff Bonwick (Sun. Microsystems).
  21. [21]
    Use obj_cgroup APIs to charge the LRU pages - LWN.net
    Jun 21, 2022 · All the kernel memory are charged with the new APIs of obj_cgroup. commit f2fe7b09a52b ("mm: memcg/slab: charge individual slab objects ...
  22. [22]
    Linux_2_6_22 - Linux Kernel Newbies
    As result, a new slab allocator called "SLUB" has been developed by Christoph Lameter from SGI, to solve those and other problems. Its design is simpler ...
  23. [23]
    Linux 6.8 To Drop The SLAB Allocator, SLUB Optimizations Coming ...
    Dec 5, 2023 · Dropping SLAB lightens the kernel load by around 5k lines of code and most important makes it easier to improve SLUB moving forward and having ...
  24. [24]
    slob: introduce the SLOB allocator - LWN.net
From: Matt Mackall <mpm@selenic.com> ; To: Andrew Morton <akpm@osdl.org>, linux-kernel@vger.kernel.org ; Subject: [PATCH 2/2] slob: introduce the SLOB allocator.
  25. [25]
    What to choose between Slab and Slub Allocator in Linux Kernel?
Mar 18, 2013 · Slub is simpler than Slab. SLOB (Simple List Of Blocks) is a memory allocator optimized for embedded systems with very little memory.
  26. [26]
    The Slab Allocator in the Linux kernel - hammertux
    Jan 9, 2020 · Every time we try to allocate an object using the SLOB allocator, the slob_alloc() routine will iterate through each partially free page in one ...
  27. [27]
    SLOB nears the end of the road - LWN.net
    Dec 23, 2022 · SLOB is a traditional K&R/UNIX allocator with a SLAB emulation layer, similar to the original Linux kmalloc allocator that SLAB replaced.
  28. [28]
    SLOB Removal Submitted Ahead Of The Linux 6.4 Kernel Cycle
    Apr 21, 2023 · Linux 6.2 deprecated the SLOB allocator with kernel developers recommend SLUB memory allocator being used instead. Removing SLOB lowers the ...
  29. [29]
    [PDF] KERNEL MEMORY - Dartmouth Computer Science
    The slab allocator was introduced in Solaris 2.4, replacing the buddy allocator that was part of the original SVR4 Unix. The reasons for introducing the slab ...
  30. [30]
    Changes to ZFS ARC Memory Allocation in 11.3 - Oracle Blogs
    Jul 7, 2015 · ZFS ARC. Prior to Solaris 11.3, the ZFS ARC allocated its memory from the kernel heap. space using kmem caches. This has several drawbacks ...
  31. [31]
    Chapter 9 Debugging With the Kernel Memory Allocator
    The Oracle Solaris kernel memory (kmem) allocator provides a powerful set of debugging features that can facilitate analysis of a kernel crash dump.
  32. [32]
    [PDF] Solaris Zones: Operating System Support for Consolidating ...
    Nov 19, 2004 · Hardware partitioning, while providing a very high degree of application isolation, is costly to implement and is generally limited to high-end ...
  33. [33]
    Probe Descriptions - Oracle Help Center
    You can also use patterns to list matching probes by using the patterns on the command line with the dtrace -l command. For example, the command dtrace -l -f ...
  34. [34]
    Oracle Solaris Kernel Zones and Large Pages
    Jul 4, 2019 · Anything larger than this smallest page size is seen as a "Large Page". The reason these large pages are important is for performance.
  35. [35]
    Kernel Memory Allocator - Oracle® Solaris 11.3 Tunable Parameters ...
The Oracle Solaris kernel memory allocator distributes chunks of memory for use by clients inside the kernel. The allocator creates a number of caches of ...
  36. [36]
    Linux's SLAB Allocator Is Officially Deprecated - Phoronix
    Jun 30, 2023 · Following the path of SLOB, Linux's SLAB memory allocator is now officially deprecated beginning with the Linux 6.5 kernel series.
  37. [37]
    The kernel's command-line parameters
    KNL Is a kernel start-up parameter. Parameters denoted with BOOT are actually interpreted by the boot loader, and have no meaning to the kernel directly.
  38. [38]
    Short users guide for the slab allocator — The Linux Kernel documentation
  39. [39]
    Memory Resource Controller - The Linux Kernel documentation
    Root cgroup has no limit controls. Kernel memory support is a work in progress, and the current version provides basically functionality. (See section 2.7).
  40. [40]
    slabtop(1) - Linux manual page - man7.org
slabtop displays detailed kernel slab cache information in real time. It displays a listing of the top caches sorted by one of the listed sort criteria.
  41. [41]
    CONFIG_DEBUG_SLAB: Debug slab memory allocations
    General informations. The Linux kernel configuration item CONFIG_DEBUG_SLAB has multiple definitions: Debug slab memory allocations found in mm/Kconfig.
  42. [42]
    As the Kernel Turns: Rust in Linux saga reaches the “Linus in all ...
    Rust, a modern and notably more memory-safe language than C, once seemed like it was on a steady, calm, and gradual approach into the Linux kernel.
  43. [43]
    ExAllocatePool function (wdm.h) - Windows drivers - Microsoft Learn
    Jan 13, 2023 · This routine is used for the general pool allocation of memory. If NumberOfBytes is PAGE_SIZE or greater, a page-aligned buffer is allocated.
  44. [44]
    [PDF] Kernel Pool Exploitation on Windows 7 - Media.blackhat.com…
Resource for dynamically allocating memory. Shared between all kernel modules and drivers. Analogous to the user-mode heap.
  45. [45]
    Memory Slabs - Zephyr Project Documentation
A memory slab is a kernel object that allows memory blocks to be dynamically allocated from a designated memory region.
  46. [46]
    System memory allocation using the malloc subsystem - IBM
    Note: AIX® uses a delayed paging slot allocation technique for storage allocated to applications. When storage is allocated to an application with a ...
  47. [47]
  48. [48]
    [PDF] Scalable Memory Reclamation for Multi-Core, Real-Time Systems
    We further improve the slab allocator by handling “remote frees” predictably. However, as real-time memory allocation is out of scope, these details can be ...
  49. [49]
    [PDF] Status of the Linux Slab Allocators
    Feb 26, 2011 · Freelist has a watermark that, when passed, flushes free objects back to ... Slab allocation development. ○. Subsystem co-maintainers. ○. Pekka ...Missing: high growth
  50. [50]
    [PDF] Unleashing Use-After-Free Vulnerabilities in Linux Kernel
    The SLAB/SLUB allocators introduce mainly two re- strictions to an attack. First, the heap management mech- anism adopted by Linux kernel generally prevents ...