
Unrolled linked list

An unrolled linked list is a linear data structure that extends the traditional linked list by storing multiple elements—typically in a fixed-size array—within each node, rather than a single element per node. The concept was introduced in 1994 by Zhong Shao, John H. Reppy, and Andrew W. Appel in the context of functional programming optimizations. This design reduces pointer and link overhead, making it a hybrid between arrays and linked lists that balances the dynamic sizing of linked lists with the contiguous memory access benefits of arrays. The structure typically employs doubly linked nodes, where each node contains an array of elements and metadata indicating the current occupancy, allowing efficient insertions and deletions without excessive rebalancing. For instance, when inserting an element, it is added to the array in the current node if space is available; otherwise, a new node is created and elements may be split across nodes to maintain balance. This approach improves performance for sequential operations like traversal, insertion, and erasure due to better cache locality compared to standard linked lists. Key advantages of unrolled linked lists include enhanced cache efficiency from storing elements sequentially, which minimizes cache misses compared to traditional linked lists whose elements are scattered in memory. They also use memory more efficiently by reducing the number of pointer fields, potentially halving space overhead for lists with small elements. These benefits make unrolled linked lists particularly suitable for applications involving frequent sequential reads or writes, such as database indexing or functional language implementations, though they may incur slightly higher costs for random access compared to pure arrays.

Fundamentals

Definition

An unrolled linked list is a variant of a singly or doubly linked list in which each node contains a fixed-size array capable of holding multiple data elements, typically 4 to 16 items, rather than a single element. This structure groups consecutive elements into blocks within nodes, connected by pointers to form the overall list. The primary purpose of an unrolled linked list is to merge the dynamic resizing and insertion flexibility of traditional linked lists with the cache efficiency of arrays, thereby reducing the overhead associated with numerous pointers and enhancing locality of reference. By minimizing the frequency of pointer traversals, it addresses inefficiencies in access patterns common in standard linked lists. This design yields key benefits, particularly improved performance during sequential traversals in environments sensitive to memory hierarchies or bandwidth constraints, where reduced pointer chasing leads to fewer cache misses and better utilization of processor caches. Unrolled linked lists emerged in the 1990s amid growing awareness of cache hierarchy challenges in modern processors, with the concept first described in the 1994 paper "Unrolling Lists" by Zhong Shao, John H. Reppy, and Andrew W. Appel, presented at the ACM Conference on LISP and Functional Programming, which focused on optimizing list representations for functional languages. Early explorations highlighted their potential for memory efficiency and access speed improvements.

Node Structure

In an unrolled linked list, each node consists of a fixed-size array capable of holding multiple elements, typically denoted as an array of size K, where K is a compile-time constant chosen to optimize memory usage and access patterns. This array stores the actual elements contiguously and is accompanied by a count field that indicates the number of active or valid elements currently occupying the array, allowing for efficient management of partially filled nodes. Additionally, the node includes one or more pointers for linking to adjacent nodes: a next pointer for singly linked variants, and both next and previous pointers for doubly linked implementations, facilitating traversal while avoiding the per-element pointer overhead found in traditional linked lists. The choice of array size K is influenced by hardware characteristics, particularly the CPU cache line size, to maximize utilization and reduce cache misses during sequential access. For instance, on systems with a 64-byte cache line, if elements are 4-byte integers, a value of K = 16 allows the array to fill the line effectively, assuming minimal overhead from padding and alignment, thereby improving spatial locality. In practice, smaller values like K = 8 are used in concurrent implementations to balance synchronization overhead with cache efficiency. Partial nodes are handled through the count field, which tracks the exact number of occupied slots in the array, enabling nodes to operate below full capacity without wasting space or requiring immediate reorganization. This design supports dynamic adjustments during list modifications, where the count ensures only valid elements are accessed, preventing errors from unused array portions. The memory layout of an unrolled linked list promotes contiguous storage of elements within each node, which contrasts with the scattered allocation of standard linked lists and reduces indirection overhead by allowing multiple elements to be fetched in a single memory access. Typically, the structure might be represented in C as:
struct UnrolledNode {
    Element array[K];    // Fixed-size array of elements
    int count;           // Number of active elements (0 <= count <= K)
    UnrolledNode* next;  // Pointer to next node
    // Optionally: UnrolledNode* prev; for doubly linked
};
This arrangement clusters related data, enhancing cache performance by aligning with how processors load and process memory blocks.

Operations

Insertion

Insertion into an unrolled linked list begins by locating the appropriate node and position within its fixed-size array to accommodate the new element. The process typically involves a linear traversal through the nodes, examining each node's count field to determine if the insertion index falls within its range, which exploits the reduced number of nodes compared to a traditional linked list. Once the target node is found, if the node's array has available space (i.e., the current count is less than the maximum capacity K), the new element is inserted at the specified offset, and the count is incremented; this step may require shifting subsequent elements within the array to maintain order. When the target node is full, splitting is necessary to create space. The algorithm creates a new node and redistributes the elements, typically moving approximately half (e.g., \lfloor K/2 \rfloor) to the new node while inserting the new element into the appropriate half, then updates the next and previous pointers to link the split nodes correctly. This redistribution ensures that both resulting nodes are roughly balanced, preventing excessive fragmentation. In implementations like the SEList, an alternative spreading strategy may redistribute elements across up to K+1 adjacent nodes if no immediate space is found within a bounded search radius, amortizing the cost over multiple operations. For insertions at specific positions, such as the head or tail, special handling applies. At the head, the first node is checked for space; if available, elements are shifted right to insert at the beginning of its array; otherwise, a new node is created and linked as the new head. Tail insertions similarly append to the last node's array or create a new trailing node. The following pseudocode illustrates the linear search to find the insertion position in a linked list (assuming a doubly linked variant for bidirectional access):
function insert(position, value):
    if head is null:
        head = create_node()
        insert_into_node(head, 0, value)
        return
    current = head
    offset = 0
    prev = null
    while current != null:
        if offset + current.count > position:
            insert_into_node(current, position - offset, value)
            return
        prev = current
        offset += current.count
        current = current.next
    // Append to tail
    if prev.count < K:
        insert_into_node(prev, prev.count, value)
    else:
        new_node = create_node()
        insert_into_node(new_node, 0, value)
        prev.next = new_node
        new_node.prev = prev
This search proceeds node-by-node until the cumulative count exceeds the target position. The insert_into_node function handles shifting elements right if needed and increments the count. To maintain efficiency, an optional balancing strategy rebalances nodes post-insertion, aiming to keep them roughly half-full. This may involve merging underfilled adjacent nodes (e.g., if a node's count falls below a threshold like K/4 after related operations) by combining their arrays and removing the extra node, though such merging is often deferred or amortized to avoid frequent link updates. This approach enhances cache locality and prevents performance degradation from imbalanced node sizes. The average time complexity for insertion is O(n/K) for the node traversal plus O(K) for in-node shifting, where n is the total number of elements and K is the node capacity; the smaller node count reduces the search overhead compared to single-element nodes.

Deletion

Deletion in an unrolled linked list involves first locating the target element's node and array index, typically via traversal. Once identified, the element is removed by shifting all subsequent elements in the node's array one position to the left to fill the gap, ensuring no stale entries remain, and then decrementing the node's count field. This shifting operation takes O(K) time, where K is the fixed array size per node, as it may require moving up to K-1 elements. If the node's count falls below a predefined minimum threshold—commonly K/4 or K/2, chosen to balance space efficiency against fragmentation—the node is merged with an adjacent one to prevent underutilized nodes. Merging entails appending the elements from the underfilled node to the adjacent node's array (or vice versa, depending on which has more space), updating the count in the receiving node, and adjusting the links between nodes. The emptied node is then deallocated, and pointers from neighboring nodes are updated to bypass it. This process maintains the list's integrity by preserving the sequential order of elements across nodes. Edge cases include deletions from partially filled nodes, where shifting may not trigger merging if the threshold is not met, and from full nodes, which always require shifting but rarely immediate merging. If deletion empties a node entirely, it is removed by relinking the previous and next nodes directly, with special handling at the head or tail to avoid null pointer issues. In doubly linked implementations, both forward and backward pointers must be updated symmetrically during merging or removal so that bidirectional traversal remains valid. The following pseudocode illustrates the core deletion logic within a node, assuming the node and index have been located:
function deleteInNode(node, index):
    // Shift elements after index to the left
    for i from index to node.count - 2:
        node.array[i] = node.array[i + 1]
    node.count -= 1
    // Optionally clear the last slot, though dense arrays rarely need it

    // Merge with a neighbor if under the minimum threshold
    if node.count < MIN_THRESHOLD:  // e.g., K/4
        mergeWithAdjacent(node)
The mergeWithAdjacent function would handle transferring elements, updating counts, and relinking, with O(K) cost in the worst case due to array copies. Overall, while an individual deletion costs O(K) plus traversal time, amortized analysis over multiple operations spreads the merging costs, preserving efficient performance.

Traversal and Search

Traversal in an unrolled linked list proceeds by following the pointers between nodes while sequentially scanning the fixed-size array within each node up to its occupancy count, which improves efficiency for sequential access patterns over traditional linked lists by reducing pointer indirections. In a singly linked variant, forward traversal begins at the head and continues until the tail, processing elements in order; doubly linked implementations mirror this for backward traversal using previous pointers, allowing bidirectional navigation without reversing the structure. The following pseudocode illustrates forward sequential traversal, where each element in the node's array is visited:
function traverse_forward(head):
    current = head
    while current is not null:
        for i from 0 to current.count - 1:
            process(current.array[i])
        current = current.next
This approach visits approximately n/K nodes for a list of n elements with capacity K, minimizing overhead from pointer chasing. Search for a specific value employs a linear scan analogous to traversal, starting from the head and examining the elements in each node's array until the value is found or the end is reached, using each node's occupancy count to skip unused slots. The time complexity is O(n), as the search may examine every element, though it traverses only O(n/K) nodes, reducing pointer overhead compared to a standard linked list. Unlike arrays, unrolled linked lists lack direct indexing for random access: reaching the i-th element requires traversing to the appropriate node and offset, which takes O(\min\{i, n-i\}/K) time in balanced doubly linked variants, making sequential or near-sequential accesses faster than in traditional lists but still linear overall. Improvements for random access can involve hybrid structures, such as adding auxiliary indexing layers, though these add complexity. Iterators for unrolled linked lists typically follow standard patterns like those in C++ or Java, maintaining internal state with a pointer to the current node and an index into its array to enable incremental traversal without exposing the underlying structure. This design supports advancing to the next element by incrementing the array index, or jumping to the next node when the array is exhausted, ensuring compatibility with standard iteration algorithms while leveraging the unrolled format for performance.

Performance Characteristics

Time Complexity

The time complexity of operations in an unrolled linked list depends on the node capacity K, which determines the number of elements stored per node, and the total number of elements n. Traversal and search require scanning all elements sequentially: following O(n/K) node pointers takes O(n/K) time, while processing the arrays within nodes takes O(n) time overall, yielding a total of O(n). However, the pointer-following cost is often highlighted as O(n/K) in analyses emphasizing the reduced linkage overhead compared to traditional linked lists. Insertion at an arbitrary position involves first locating the target node, which requires traversing O(n/K) nodes in the worst case, followed by O(K) time to shift elements within the node and potentially split it if full. Thus, the worst case is O(n/K + K). Deletion follows a similar pattern: locating the node takes O(n/K) time, and removing the element plus merging or shifting within the node (and possibly adjacent nodes) takes O(K) time, again yielding O(n/K + K) in the worst case. Amortized analysis, particularly for structures like the space-efficient list (SEList) variant, spreads the cost of occasional rebalancing (such as block splitting or gathering) across multiple operations. With a fixed K, operations near the list ends can achieve amortized O(1) time if insertions and deletions maintain balance, though arbitrary-position operations remain O(K + n/K) on average; this contrasts with worst-case spikes during rebalancing that can reach O(K^2). Choosing K \approx \sqrt{n} optimizes to amortized O(\sqrt{n}) per operation, balancing traversal and local work. Larger K reduces the n/K traversal term (fewer nodes to visit) but increases local operation costs, while n scales traversal linearly; optimal K values (e.g., matching cache line sizes) minimize overall time for typical workloads.
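The choice K \approx \sqrt{n} follows directly from minimizing the combined per-operation cost; treating K as continuous, the worked step is:

```latex
f(K) = \frac{n}{K} + K, \qquad
f'(K) = -\frac{n}{K^{2}} + 1 = 0
\;\Longrightarrow\; K = \sqrt{n}, \qquad
f(\sqrt{n}) = \frac{n}{\sqrt{n}} + \sqrt{n} = 2\sqrt{n} = O(\sqrt{n}).
```

Intuitively, the traversal term n/K and the local-work term K trade off against each other, and the total is smallest where the two are equal.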

Space Efficiency and Cache Locality

Unrolled linked lists achieve better space efficiency than traditional linked lists primarily through a reduction in pointer overhead. In a conventional singly linked list, storing n elements requires n nodes, each containing one element and one pointer (typically 8 bytes on 64-bit systems), so for small elements roughly 50% of the memory is dedicated to pointers alone. By contrast, an unrolled linked list organizes K elements into a contiguous array within each node, along with a single next pointer and a count field to track the number of occupied slots, yielding only m ≈ n/K pointers overall. Each node incurs O(K) space for the array plus O(1) for the pointer and count, leading to a total of O(n) while amortizing the fixed overhead across multiple elements per node. Consolidating elements reduces the total number of pointers by a factor of K, significantly lowering memory usage compared to single-element nodes; with K tuned to cache line sizes (e.g., K = 8 for 64-byte lines and 8-byte elements), this can enhance storage density without excessive fragmentation. However, drawbacks include potential wasted space in partially filled nodes, particularly the last one, and a fixed block size K that may limit adaptability to varying data sizes or access patterns. Regarding cache locality, the array-based storage within nodes promotes sequential access, aligning well with modern memory hierarchies where data is fetched in fixed-size cache lines (commonly 64 bytes). Traversing an unrolled linked list loads an entire cache line that serves up to K elements at once—versus roughly one element per fetch in a scattered standard linked list—minimizing compulsory and capacity misses during linear scans or searches. This spatial locality reduces cache misses relative to traditional linked lists, as the contiguous layout ensures that adjacent elements share cache lines. Empirical evaluations confirm these benefits, with implementations demonstrating up to 3× performance improvements.
For example, in concurrent implementations using lazy synchronization, such gains have been observed over other concurrent list-based data structures on modern multiprocessors, attributed to enhanced cache utilization and reduced pointer chasing. Such gains are particularly pronounced in memory-bound scenarios, though they depend on selecting K appropriately for the hardware's cache parameters.
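To make the overhead figures concrete, consider illustrative sizes of 8-byte elements, 8-byte pointers, and an 8-byte count field (a 64-bit system); the fraction of memory spent on metadata works out as:

```latex
\text{traditional node: } \frac{8}{8 + 8} = 50\% \text{ overhead}, \qquad
\text{unrolled node } (K = 16): \frac{8 + 8}{16 \cdot 8 + 8 + 8}
  = \frac{16}{144} \approx 11\%.
```

Larger K drives the unrolled figure lower still, at the cost of more potential slack in partially filled nodes.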

Comparisons and Applications

Comparison with Traditional Linked Lists

Unrolled linked lists and traditional linked lists both enable dynamic storage and efficient local modifications, but they diverge significantly in design and performance trade-offs due to the former's use of array-based nodes containing multiple elements. In traditional singly or doubly linked lists, each node holds a single element alongside one or two pointers, leading to scattered memory allocation and high overhead from frequent indirections. Unrolled linked lists group K elements (typically 10–100) into a contiguous array per node with a single next pointer (or two for doubly linked variants), which fundamentally alters access patterns, memory efficiency, and operation costs. Access patterns in unrolled linked lists excel for sequential reads, as the contiguous storage within nodes minimizes cache misses by leveraging spatial locality—traversing K elements requires only one pointer dereference per node, compared to one per element in traditional lists. This reduces the effective traversal steps to O(n/K) pointer follows, yielding fewer cache line fetches during linear scans. However, random access remains O(n) in both structures, as locating an arbitrary element necessitates traversing nodes from the head or a known position, with similar pointer-chasing overhead once the target node is reached. Memory usage benefits from the unrolled approach through reduced pointer overhead: traditional linked lists allocate one pointer (4–8 bytes) per element, resulting in O(n) extra space, whereas unrolled lists use one pointer per K elements, cutting overhead by a factor of up to K and limiting extra space to O(n/K) plus waste from partial nodes. For instance, with K = 32 and 8-byte elements, pointer storage drops from ~50% of total memory in a traditional list to roughly 3%, though underfilled nodes can waste up to K-1 slots per node if not balanced. Optimized variants, like space-efficient lists, further minimize this to O(√n) extra space by dynamically adjusting block sizes around √n.
Operation costs for insertion and deletion highlight key trade-offs. Traditional linked lists achieve O(1) time for these at a known position via simple pointer updates, without shifting data. In unrolled linked lists, insertions or deletions within a node require O(K) array shifts to maintain contiguity, and balancing (e.g., splitting full nodes or merging sparse ones) adds amortized O(1) but worst-case O(K) per operation, making them slower for isolated random modifications yet more efficient overall for large n due to fewer memory allocations and better cache reuse during bulk changes. Unrolled linked lists are advantageous for cache-bound workloads emphasizing sequential traversal, such as streaming or log processing, where locality drives performance gains; traditional linked lists suit applications with frequent random insertions or deletions, prioritizing the simplicity of constant-time local updates over traversal efficiency. For example, benchmarks on sequential traversal of lists with 1 million elements show unrolled variants achieving up to 2× faster execution times compared to traditional linked lists, primarily from reduced cache misses.

Comparison with Arrays and Other Structures

Unrolled linked lists provide dynamic growth without the periodic resizing operations that incur O(n) time costs in dynamic arrays, allowing incremental expansion by adding new nodes as needed. This makes them particularly suitable for scenarios involving sparse data or frequent insertions, where dynamic arrays might require costly reallocations to double their capacity and copy existing elements. However, unlike arrays, which support O(1) random access via indexing, unrolled linked lists rely on sequential traversal through nodes, resulting in O(n) average-case access time for arbitrary elements. Compared to dynamic arrays such as C++ vectors, unrolled linked lists avoid the amortized costs of frequent resizing by distributing elements across fixed-capacity nodes linked together, eliminating large-scale copies during insertions. This can lead to more predictable latency in insert-heavy workloads, though it incurs higher memory usage from per-node pointer overhead compared to the contiguous storage of dynamic arrays. In relation to other hybrid structures like skip lists and B-trees, unrolled linked lists offer simplicity for linear access patterns, as they maintain a straightforward node-linking mechanism without the probabilistic level promotions of skip lists or the balancing requirements of B-trees. While skip lists and B-trees achieve O(log n) search times through multi-level indexing or node splitting and merging, unrolled linked lists exhibit O(n) search complexity but reduce traversal steps by a factor of the block size K, making them less balanced yet easier to implement for sequential operations. Unrolled linked lists can serve as a foundational component in more advanced hybrids, such as unrolled skip lists, where array-based nodes enhance storage density and cache locality within skip list frameworks.
Regarding space efficiency, unrolled linked lists store the elements themselves about as densely as arrays but add linking overhead, resulting in approximately 1.1 to 1.5 times the memory footprint of a pure array in dynamic scenarios, depending on the block size K and the pointer and element sizes; this trade-off provides linked-list flexibility without the fragmentation risks of traditional linked lists.

Practical Use Cases

Unrolled linked lists find application in database systems for managing free lists of pages, where each node stores multiple page identifiers, balancing the flexibility of linked structures with efficient storage in embedded and lightweight databases. In persistent-memory environments, variants like the Detectable Unrolled Lock-based Linked list (DULL) support durable, scalable set operations for database indexing and recovery, enabling crash-consistent data structures that outperform traditional linked lists by over twofold in throughput for update-intensive workloads across 20 to 250 threads. In multi-threaded software, concurrent unrolled linked lists serve as building blocks for high-performance data structures such as graphs and hash tables, providing up to 300% higher throughput than lock-free alternatives in read-write scenarios due to reduced pointer dereferences and enhanced spatial locality. These structures are particularly beneficial in scenarios involving frequent sequential scans, such as processing event streams or log files, where their cache-friendly design minimizes memory access latency compared to standard linked lists. Implementations appear in programming libraries to address performance bottlenecks in general-purpose collections; for instance, the PLF C++ library offers an unrolled linked list as a higher-performance alternative to std::list, achieving 293% faster insertions, 57% faster erasures, and 17% faster iterations by storing multiple elements per node and optimizing allocator usage. Similarly, the Chapel language includes a built-in unrolled linked list module for tasks requiring efficient dynamic lists. Despite these advantages, unrolled linked lists introduce overhead from partially filled nodes and more complex iteration logic, making them less suitable for very small lists, where the per-node allocation cost outweighs the benefits, or for applications demanding frequent keyed lookups, which favor hash tables instead.
A case study in cache optimization demonstrates that unrolled linked lists can reduce memory access times by grouping elements into cache-line-sized arrays, yielding up to 40% runtime savings in traversal-heavy workloads compared to traditional linked lists.

    An unrolled linked list is a linear data structure that is a variant on the linked list. Instead of just storing 1 element at each node, unrolled linked lists ...Missing: sources | Show results with:sources