
Shellsort

Shellsort is a comparison-based sorting algorithm that improves upon insertion sort by first sorting elements that are far apart in the array, using a sequence of progressively smaller increments (or gaps) to define the subarrays it sorts. Invented by Donald L. Shell and first described in 1959, it operates in place and is unstable, meaning that equal elements may not retain their relative order. The algorithm begins with a large gap size h, applying insertion sort to each of the h independent subarrays formed by elements separated by h positions; it then reduces h according to a predefined sequence (such as Shell's original halving scheme) until h = 1, at which point a standard insertion sort completes the process. This diminishing-gap approach reduces the total number of comparisons and swaps relative to naive insertion sort, and with well-chosen gap sequences it achieves subquadratic running time. Performance depends heavily on the increment sequence: Shell's original sequence has a worst-case time complexity of O(n^2), while sequences such as Hibbard's (h_k = 2^k - 1) achieve O(n^{3/2}) and Sedgewick's achieve O(n^{4/3}). Theoretical bounds include a worst-case upper bound of O(n^{1 + \epsilon}) for any \epsilon > 0 with carefully chosen gaps, and a lower bound of \Omega(n (\log n / \log \log n)^2) comparisons regardless of the sequence. Empirical studies show Shellsort to be efficient for moderate-sized inputs thanks to its simplicity and low overhead, often outperforming more complex algorithms such as quicksort on small arrays, though it lacks the O(n \log n) worst-case guarantees of mergesort and heapsort. Variants include multi-pass strategies such as h-bubble, h-shake, and h-brick passes that further reduce inversions, as well as randomized versions for data-oblivious sorting in secure-computation contexts. Despite decades of research into optimal gap sequences by researchers including Knuth, Pratt, and Incerpi, Shellsort remains valued for its ease of implementation and practical speed, requiring neither recursion nor extra memory.

Overview

Description

Shellsort is a comparison-based sorting algorithm that generalizes insertion sort by allowing the comparison and exchange of elements that are far apart in the array; each pass with gap h, known as an h-sort, insertion-sorts the subarrays whose elements lie h positions apart. Introduced by Donald Shell in 1959, it performs a series of such passes with decreasing gaps, enabling elements to move toward their final positions much faster than the adjacent swaps of standard insertion sort allow.

The primary motivation stems from a limitation of insertion sort: a distantly inverted element must be shifted one position at a time, which leads to O(n²) operations on inputs with many large inversions. By preprocessing the array with larger gaps, Shellsort removes many distant inversions early, bringing elements roughly into their correct relative order before smaller gaps, and ultimately a full insertion sort with gap 1, finish the job. This leverages the adaptive nature of insertion sort on partially ordered data while mitigating its inefficiency on random inputs.

At a high level, the algorithm begins with a large initial gap (often around n/2, where n is the array length) and applies insertion sort to each of the independent subarrays formed by elements spaced at that gap. The gap is then systematically reduced, typically by halving or by following a predefined sequence, and the passes repeat on the new subarrays until the gap reaches 1, at which point the entire array is fully sorted. Many gap sequences are possible; their selection is discussed in the dedicated analyses below. Shellsort is an in-place algorithm that requires no memory beyond the input array, adapts well to partially sorted input, and is efficient for moderately sized arrays in practice. Its simple, non-recursive implementation further suits resource-constrained environments.

To illustrate, consider sorting the array [5, 2, 8, 1, 9] with an initial gap of 2. The subarrays are the elements at even indices (positions 0, 2, 4: [5, 8, 9], already sorted) and at odd indices (positions 1, 3: [2, 1]). Applying insertion sort to the odd subarray swaps 2 and 1, yielding [5, 1, 8, 2, 9]. This pass reduces inversions (1 moves closer to its final position near the front) without resolving the full order; subsequent smaller gaps refine it further, as sketched in the code below.
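A minimal C sketch of a single gap pass makes the trace concrete; the gap of 2 matches the example above, and the helper name h_pass is illustrative rather than standard:

#include <stdio.h>

/* One h-sort pass: insertion-sorts the h interleaved subarrays in place. */
static void h_pass(int a[], int n, int h) {
    for (int i = h; i < n; i++) {
        int temp = a[i];
        int j = i;
        while (j >= h && a[j - h] > temp) {
            a[j] = a[j - h];   /* shift within the subarray */
            j -= h;
        }
        a[j] = temp;
    }
}

int main(void) {
    int a[] = {5, 2, 8, 1, 9};
    h_pass(a, 5, 2);           /* 2-sort: prints 5 1 8 2 9 */
    for (int i = 0; i < 5; i++) printf("%d ", a[i]);
    printf("\n");
    return 0;
}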

History

Shellsort was invented by Donald L. Shell, an American computer scientist employed at the General Electric Company in Cincinnati, Ohio. In 1959 he published the algorithm in his paper "A High-Speed Sorting Procedure" in Communications of the ACM, the first formal description of the method. Shell developed the algorithm during his work on computing systems in the late 1950s, drawing on existing techniques such as insertion sort but innovating with a series of diminishing increments that allow efficient element exchanges across large distances early in the process. This addressed the limitations of simple sorts on larger datasets held in internal memory, reflecting the computational constraints of the era. The empirical evaluations in Shell's paper showed substantial gains over basic insertion sort, prompting early adoption in practical computing, and the diminishing-increment strategy strongly influenced later sorting research, establishing a foundation for increment-based optimizations in comparison sorts.

After 1959 the algorithm underwent significant refinement, particularly in gap-sequence design to mitigate worst-case inefficiencies. A notable early improvement came in 1963, when Thomas N. Hibbard proposed gaps following powers of 2 minus 1 (e.g., 1, 3, 7, 15, ...), which empirical studies showed perform better than Shell's original halving scheme. Further advances from the 1960s through the 1980s, including the sequence popularized by Donald E. Knuth in 1973, continued to improve the algorithm's theoretical and practical standing.

In the 2020s, Shellsort remains a subject of theoretical investigation for insights into adaptive sorting behavior, with recent work analyzing its optimization challenges and complexity under various gap configurations, including a 2025 study reporting expected O(n log² n) complexity for specific increment sequences. It also sees occasional use in modern contexts, such as the bzip2 compression utility, where its in-place operation and avoidance of deep recursion suit resource-limited environments.

Algorithm

Core Mechanism

Shellsort operates by iteratively sorting subsets of the input array defined by a decreasing sequence of gaps, in what are known as h-sorting phases. For a given gap h, the array is conceptually divided into h interleaved subarrays, each consisting of the elements at indices i, i+h, i+2h, ..., for i = 0 to h-1. Each subarray is independently sorted with insertion sort, shifting elements within the subarray while operating directly on the original array positions. The process begins with the largest gap (often on the order of n/2 or n/3, depending on the gap sequence) and reduces h in subsequent iterations until h reaches 1.

The choice of gaps is crucial to efficiency: large initial gaps address long-range disorder by letting far-apart elements be compared and moved early, reducing the number of inversions (pairs of elements out of their natural order) faster than a single insertion sort would. As the gaps diminish, the focus shifts to refining local order within smaller subarrays, progressively bringing the array closer to fully sorted. This staged reduction ensures that by the time h = 1, the array is already largely ordered, minimizing the work of the final phase. All operations (comparisons, shifts, and swaps) are performed in place on the original array, requiring no storage beyond a few index variables. The algorithm terminates after the h = 1 phase, a standard insertion sort over the entire array that benefits from the prior phases having eliminated most distant inversions.

To illustrate, consider a 7-element array A = [3, 7, 1, 5, 2, 4, 6] using gaps h=3 followed by h=1 (a simplified sequence for demonstration).
  • Initial array: [3, 7, 1, 5, 2, 4, 6]
For h=3, the three interleaved subarrays are:
  • Subarray 0 (indices 0,3,6): [3, 5, 6] — already sorted, no changes.
  • Subarray 1 (indices 1,4): [7, 2] — insert 2 before 7, shifting: array becomes [3, 2, 1, 5, 7, 4, 6].
  • Subarray 2 (indices 2,5): [1, 4] — 4 > 1, no change.
  • After h=3: [3, 2, 1, 5, 7, 4, 6]
For h=1, perform insertion sort on the full array:
  • Pass over elements, shifting out-of-order items: after full passes, the array sorts to [1, 2, 3, 4, 5, 6, 7].
This trace shows how the initial h=3 phase rearranges distant elements (e.g., moving 2 past 7), reducing inversions before the local h=1 refinement completes the sort.
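The trace can be reproduced with a short C sketch; the explicit gap list {3, 1} matches the simplified demonstration sequence above, and the printing is only for illustration:

#include <stdio.h>

/* Shellsort over an explicit gap list, printing the array after each
   pass to reproduce the trace above. */
static void shellsort_gaps(int a[], int n, const int gaps[], int ngaps) {
    for (int g = 0; g < ngaps; g++) {
        int h = gaps[g];
        for (int i = h; i < n; i++) {
            int temp = a[i];
            int j = i;
            while (j >= h && a[j - h] > temp) {
                a[j] = a[j - h];
                j -= h;
            }
            a[j] = temp;
        }
        printf("after h=%d:", h);
        for (int i = 0; i < n; i++) printf(" %d", a[i]);
        printf("\n");
    }
}

int main(void) {
    int a[] = {3, 7, 1, 5, 2, 4, 6};
    const int gaps[] = {3, 1};
    shellsort_gaps(a, 7, gaps, 2);
    /* prints: after h=3: 3 2 1 5 7 4 6
               after h=1: 1 2 3 4 5 6 7 */
    return 0;
}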

Pseudocode

Shellsort is typically presented in pseudocode as an in-place sorting algorithm that operates on an array of comparable elements, generalizing insertion sort by applying it to subarrays defined by decreasing gap values until the array is fully sorted. The standard pseudocode, assuming the simple gap sequence that starts at half the array length and halves down to 1, follows this structure:
procedure ShellSort(A: array of size n)
    if n <= 1 then
        return  // Empty or single-element array is already sorted
    end if
    
    gap ← floor(n / 2)
    while gap > 0 do
        for i ← gap to n-1 do
            temp ← A[i]
            j ← i
            while j >= gap and A[j - gap] > temp do
                A[j] ← A[j - gap]
                j ← j - gap
            end while
            A[j] ← temp
        end for
        gap ← floor(gap / 2)
    end while
end procedure
This template initializes the first gap to approximately half the array size and halves it each iteration, so the gaps are positive integers decreasing to 1. The outer loop iterates over the gap values; the middle loop scans the array from index gap onward, treating each subarray A[i], A[i+gap], A[i+2*gap], ... as a sequence to sort via insertion. The inner while loop performs the shifts and insertion for each element, moving entries by multiples of the gap whenever an inversion is detected (i.e., when A[j - gap] > temp).

For edge cases, an input of zero or one element causes the procedure to return immediately, since no sorting is needed; the gap-loop condition likewise guarantees that only positive gaps are used, preventing invalid index arithmetic. While this pseudocode employs Donald Shell's original halving sequence, the algorithm adapts to custom gap sequences by replacing the gap initialization and update with a predefined list of decreasing positive integers ending at 1, such as those analyzed in later refinements.

To illustrate, consider applying the pseudocode to the array A = [5, 2, 8, 1, 9, 3] with n = 6. Initially, gap = 3. For i = 3 (temp = A[3] = 1), the relevant subarray entries are A[0] = 5 and A[3] = 1; since 5 > 1, the loop shifts: A[3] ← 5, j becomes 0, and 1 is inserted at A[0], yielding [1, 2, 8, 5, 9, 3]. For i = 4 (temp = A[4] = 9), A[1] = 2 < 9, so nothing shifts. For i = 5 (temp = A[5] = 3), A[2] = 8 > 3, so A[5] ← 8, j becomes 2, and 3 is inserted at A[2], yielding [1, 2, 3, 5, 9, 8]. The next gap is 1, which performs a full insertion sort on the partially ordered array, producing the sorted [1, 2, 3, 5, 8, 9]. At each step the inner loop compares and shifts only within the current gap's subarrays, progressively reducing disorder.
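The pseudocode translates directly into C; the following sketch (function and variable names are illustrative) mirrors it with Shell's halving gaps and reproduces the worked example:

#include <stdio.h>

/* In-place Shellsort with Shell's original halving gap sequence,
   mirroring the pseudocode above. */
void shellsort(int a[], int n) {
    if (n <= 1) return;                 /* already sorted */
    for (int gap = n / 2; gap > 0; gap /= 2) {
        for (int i = gap; i < n; i++) {
            int temp = a[i];
            int j = i;
            while (j >= gap && a[j - gap] > temp) {
                a[j] = a[j - gap];      /* shift by one gap */
                j -= gap;
            }
            a[j] = temp;                /* insert into its subarray */
        }
    }
}

int main(void) {
    int a[] = {5, 2, 8, 1, 9, 3};
    shellsort(a, 6);
    for (int i = 0; i < 6; i++) printf("%d ", a[i]);  /* 1 2 3 5 8 9 */
    printf("\n");
    return 0;
}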

Gap Selection Process

In Shellsort, gap selection generates a sequence of decreasing increments, known as gaps, that determine the distances between elements compared in each pass. The original method, proposed by Donald Shell in 1959, starts with an initial gap of roughly half the array size, h = \lfloor n/2 \rfloor, and repeatedly halves it until reaching 1, yielding gaps n/2, n/4, n/8, \dots, 1. This lets the algorithm first sort widely spaced subsequences and progressively refine the order with smaller gaps.

Effective gap selection follows a few criteria. Gaps should ideally be pairwise coprime (each pair having greatest common divisor 1) to prevent elements from clustering into non-interacting residue classes, since shared factors lead to redundant comparisons. The sequence must also be strictly decreasing, so that each pass refines the partial order established by the larger gaps before it, converging to a fully sorted array once the final gap is 1.

Gaps are integrated into the algorithm either by precomputing the full sequence or by generating them dynamically during execution. In the dynamic approach, common in simple implementations, the outer loop iterates over the gaps as follows:
for (int h = n / 2; h > 0; h /= 2) {
    // Perform insertion sort on subsequences spaced by h
    for (int i = h; i < n; i++) {
        int temp = a[i];
        int j;
        for (j = i; j >= h && a[j - h] > temp; j -= h) {
            a[j] = a[j - h];
        }
        a[j] = temp;
    }
}
This structure ensures that for each gap h, the array is partitioned into h independent subsequences, each sorted via insertion sort.

Empirical guidelines for gap selection emphasize balancing the number of passes against the computational work per pass. Sequences of pure powers of 2, as in Shell's original method, should be avoided because they exhibit pathological behavior on certain inputs: when n is a power of 2, every gap except the final 1 is even, leading to uneven element movement. Designers instead test candidate sequences for their ability to reduce inversions efficiently on random and worst-case inputs, typically aiming for a logarithmic number of gaps (\Theta(\log n)) to achieve subquadratic performance.

A common pitfall is choosing gaps that share a common factor. If every gap before the final 1 is even, for example, exchanges are confined to positions of the same parity, so even- and odd-indexed elements never interact until the last pass, and the algorithm degrades to O(n^2) time by failing to interleave the subsequences. Shell's original halving sequence suffers exactly this interaction, yielding \Theta(n^2) worst-case comparisons; the sketch below constructs such an adversarial input.
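As an illustration of this pitfall, the following sketch (the input construction and the shift counter are illustrative, assuming Shell's halving sequence) builds an adversarial input for n a power of two and counts element shifts; the counts grow roughly fourfold as n doubles, the signature of \Theta(n^2) behavior:

#include <stdio.h>
#include <stdlib.h>

/* Counts element shifts performed by Shellsort with the halving sequence. */
static long shellsort_shifts(int a[], int n) {
    long shifts = 0;
    for (int gap = n / 2; gap > 0; gap /= 2) {
        for (int i = gap; i < n; i++) {
            int temp = a[i];
            int j = i;
            while (j >= gap && a[j - gap] > temp) {
                a[j] = a[j - gap];
                j -= gap;
                shifts++;
            }
            a[j] = temp;
        }
    }
    return shifts;
}

int main(void) {
    /* Adversarial input for n a power of two: the larger half at even
       indices, the smaller half at odd indices, each half already sorted.
       Every gap except 1 is even, so no pass mixes the halves until the
       final gap-1 pass, which then does Theta(n^2) work. */
    for (int n = 1024; n <= 8192; n *= 2) {
        int *a = malloc(n * sizeof *a);
        for (int i = 0; i < n; i++)
            a[i] = (i % 2 == 0) ? n / 2 + i / 2 + 1 : i / 2 + 1;
        printf("n=%5d  shifts=%ld\n", n, shellsort_shifts(a, n));
        free(a);
    }
    return 0;
}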

Gap Sequences

Common Sequences

Shellsort's performance is highly dependent on the choice of gap sequence, and several well-established sequences have been developed over time. The original sequence proposed by its inventor uses simple halvings, while later refinements introduced geometric progressions and empirical optimizations.

Donald Shell's original gap sequence, introduced in 1959, starts with an initial gap of h = \lfloor n/2 \rfloor and repeatedly halves it until reaching 1, yielding gaps such as n/2, n/4, n/8, \dots, 1. This approach is straightforward but can perform poorly, with a worst-case time complexity of O(n^2).

In 1963, Thomas Hibbard proposed gaps of the form h_k = 2^k - 1 for k = 1, 2, 3, \dots, producing 1, 3, 7, 15, 31, and so on. This sequence achieves a proven worst-case time complexity of O(n^{3/2}), an early theoretical improvement over Shell's method.

Donald Knuth suggested, in his seminal work on sorting, the sequence h_k = \frac{3^k - 1}{2} for k = 1, 2, 3, \dots, generating gaps 1, 4, 13, 40, 121. This formulation balances the growth of gaps so that inversions are reduced more evenly across passes, contributing to good practical behavior.

Robert Sedgewick proposed in 1986 a sequence merging two geometric progressions, one with terms 9 \cdot 4^k - 9 \cdot 2^k + 1 and the other 4^k - 3 \cdot 2^k + 1 (for k \geq 2), giving gaps 1, 5, 19, 41, 109, 209, and so on. It attains a worst-case bound of O(n^{4/3}) and is noted for fast average-case performance in practice due to its distribution of gap sizes. (A related 1985 construction by Incerpi and Sedgewick achieves worst-case O(n^{1+\epsilon}) for any fixed \epsilon > 0.)

Vaughan Pratt's sequence, outlined in his 1972 thesis, consists of all 3-smooth numbers (products of powers of 2 and 3, i.e., 2^a \cdot 3^b for non-negative integers a, b) not exceeding the array size, applied in decreasing order down to 1; in ascending order the sequence begins 1, 2, 3, 4, 6, 8, 9, 12. It is theoretically significant for achieving O(n \log^2 n) worst-case complexity, though it requires \Theta(\log^2 n) increments, and the resulting large number of passes makes it less practical for typical implementations.

A more recent empirical sequence by Marcin Ciura, published in 2001, was optimized through experimentation for moderate array sizes (up to about 10^5 elements). It uses the fixed gaps 1, 4, 10, 23, 57, 132, 301, 701, selected to minimize average-case comparisons and moves, and it outperforms many predecessors in benchmarks for moderate-sized inputs. A sketch that generates several of these sequences follows.
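The named sequences are easy to generate in code; in this sketch the function names are illustrative and the cutoffs follow common conventions rather than any single reference:

#include <stdio.h>

/* Print Hibbard gaps 2^k - 1 below n, largest first. */
static void hibbard(int n) {
    int h = 1;
    while (2 * h + 1 < n) h = 2 * h + 1;   /* largest 2^k - 1 below n */
    for (; h >= 1; h = (h - 1) / 2) printf("%d ", h);
    printf("\n");
}

/* Print Knuth gaps (3^k - 1)/2, kept below about n/3 (a common convention). */
static void knuth(int n) {
    int h = 1;
    while (3 * h + 1 < (n + 2) / 3) h = 3 * h + 1;
    for (; h >= 1; h = (h - 1) / 3) printf("%d ", h);
    printf("\n");
}

/* Print Ciura's empirically tuned gaps below n, largest first. */
static void ciura(int n) {
    static const int g[] = {701, 301, 132, 57, 23, 10, 4, 1};
    for (int i = 0; i < 8; i++)
        if (g[i] < n) printf("%d ", g[i]);
    printf("\n");
}

int main(void) {
    hibbard(1000);  /* 511 255 127 63 31 15 7 3 1 */
    knuth(1000);    /* 121 40 13 4 1 */
    ciura(1000);    /* 701 301 132 57 23 10 4 1 */
    return 0;
}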

Sequence Properties and Design

Effective gap sequences in Shellsort are designed to balance efficiency across the multiple insertion-sort passes. Desirable properties include a geometric decrease in gap sizes, typically with ratios between 2 and 3, ensuring rapid reduction while maintaining coverage of the array. Ratios such as 11/5 ≈ 2.2 promote an even distribution of comparisons and minimize the inversions carried over between passes. Gaps should also be pairwise relatively prime (coprime in pairs) where possible, to avoid amplifying periodic disorder in the input; successive gaps that share large common factors let subarrays remain poorly mixed between passes, inviting worst-case behavior. Even coverage ensures that distant elements are compared early, smoothing the transition to finer sorting in later passes.

Designing these sequences involves key trade-offs between the number of passes and the size of individual gaps: longer sequences with more small gaps increase the total work per pass but leave the array better sorted between passes, while shorter sequences with larger initial gaps reduce the pass count at the risk of an expensive final stage. The usual goal is O(log n) passes, which suffices for subquadratic performance in practice. Sequences emphasizing fewer passes use ratios closer to 3, trading some evenness for speed, whereas those prioritizing uniformity opt for ratios nearer 2.2.

To avoid worst-case pathologies, sequences must limit input-dependent degradation, such as when gaps share large common factors that align with adversarial patterns; coprime gaps, together with number-theoretic tools like the Frobenius coin problem, bound the disorder that can propagate between passes, ensuring no subarray experiences excessive swaps. This matters because gaps sharing a common factor greater than 1 can trap elements in unsorted residue classes, inflating the comparison count by up to a factor of that divisor.

Early gap sequences, such as Shell's (1959) and Hibbard's (1963), were derived from simple progressions (halvings, or powers of 2 minus 1) chosen for analytical convenience rather than extensive testing. In contrast, later work by Sedgewick and Incerpi in the 1980s and by Ciura in 2001 shifted toward empirical tuning via simulation on random permutations, optimizing average-case metrics such as comparisons and assignments on arrays of up to several thousand elements. These simulation-driven studies revealed that theoretically attractive sequences often underperform in practice because of high constant factors, favoring hybrid designs that combine geometric growth with data-driven refinements.

Despite these advances, the optimal gap sequence remains conjectural, with open problems centering on sequences that adapt to input distributions beyond uniform random data, potentially incorporating runtime analysis of disorder types. Post-2000 research has explored parameterized gap functions; for example, a 2023 study introduced sequences derived by grid-search optimization of parameters, potentially improving scalability for larger inputs, though surpassing empirically tuned sequences like Ciura's remains difficult. A 2021 analysis likewise examined gamma-parameterized gaps inspired by Ciura's method.
As an example evaluation, for an array of size n = 1000 the Hibbard sequence uses nine passes (gaps 511, 255, 127, 63, 31, 15, 7, 3, 1), while the empirically tuned Ciura sequence uses eight (701 down through 1), demonstrating how simulation-optimized designs reduce overhead without sacrificing effectiveness. A generator for geometric sequences of this kind is sketched below.
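The following sketch generates a geometric gap sequence, assuming a ratio of 2.25 (an illustrative choice near the ratios discussed above, not a prescribed constant), and reports the pass count it implies for n = 1000:

#include <stdio.h>

/* Generate gaps h_{k+1} = floor(2.25 * h_k) + 1 starting at 1,
   stopping below n; passes run from out[count-1] down to out[0] = 1. */
static int geometric_gaps(int n, int out[]) {
    int count = 0;
    for (long h = 1; h < n; h = (long)(2.25 * h) + 1)
        out[count++] = (int)h;
    return count;
}

int main(void) {
    int gaps[32];
    int k = geometric_gaps(1000, gaps);
    printf("%d passes:", k);
    for (int i = k - 1; i >= 0; i--) printf(" %d", gaps[i]);
    printf("\n");   /* 9 passes: 964 428 190 84 37 16 7 3 1 */
    return 0;
}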

Analysis

Time Complexity

The time complexity of Shellsort depends heavily on the chosen gap sequence and on input characteristics, and no tight universal bound has been established. Worst-case behavior ranges from subquadratic to O(n^2) depending on the gaps, while a general lower bound of \Omega\left(n \left(\frac{\log n}{\log \log n}\right)^2\right) comparisons holds regardless of the sequence.

For specific gap sequences, performance improves with better designs. Shell's original halving sequence has a worst case of \Theta(n^2), reached on inputs whose parity classes stay separated until the final pass; analyses of random inputs suggest behavior closer to O(n^{3/2}), but its limited mixing makes it worse in practice than better sequences. The Hibbard sequence (gaps of the form 2^k - 1) achieves a worst case of O(n^{3/2}), and Sedgewick's sequences achieve O(n^{4/3}). The average-case complexity of practical sequences like Sedgewick's remains an open theoretical problem, though empirical studies indicate strongly subquadratic performance.

A sketch of the derivation shows why the bounds vary. For gap h the array splits into h subarrays of length about n/h, and insertion sort on one subarray costs O((n/h)^2) in the worst case, so a single pass costs O(h \cdot (n/h)^2) = O(n^2 / h); summed over geometrically decreasing gaps larger than \sqrt{n}, this contributes O(n^{3/2}). For the small gaps h < \sqrt{n}, the earlier passes guarantee partial order: with Hibbard-type gaps each element moves only O(h) positions during an h-sort, so a pass costs O(nh), and \sum h over the geometric gaps below \sqrt{n} is O(\sqrt{n}), contributing another O(n^{3/2}). Together these give the O(n^{3/2}) bound for Hibbard's sequence.

In the best case, a nearly sorted input, each pass with gap h performs about n - h comparisons and few shifts, so with \Theta(\log n) gaps the total is O(n \log n). Average-case performance for strong sequences is empirically about O(n^{1.3}), while adversarial inputs against weak sequences approach O(n^2). Input randomness and sequence quality are the key factors; empirical studies show Shellsort running 2 to 5 times slower than quicksort on average, yet far faster than bubble sort for lists beyond 100 elements. The sketch below counts comparisons for two gap sequences on identical input.
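The effect of sequence quality on comparison counts can be checked empirically; in this sketch the instance size, the random seed, and the 2.25 extension factor for Ciura's gaps are illustrative assumptions, not a rigorous benchmark:

#include <stdio.h>
#include <stdlib.h>

/* Shellsort over gaps[] given largest-first; returns key comparisons. */
static long sort_count(int a[], int n, const int gaps[], int ngaps) {
    long cmps = 0;
    for (int g = 0; g < ngaps; g++) {
        int h = gaps[g];
        if (h >= n) continue;
        for (int i = h; i < n; i++) {
            int temp = a[i];
            int j = i;
            /* comma operator counts every executed comparison */
            while (j >= h && (cmps++, a[j - h] > temp)) {
                a[j] = a[j - h];
                j -= h;
            }
            a[j] = temp;
        }
    }
    return cmps;
}

int main(void) {
    enum { N = 100000 };
    static int a[N], b[N];
    srand(42);
    for (int i = 0; i < N; i++) a[i] = b[i] = rand();

    /* Shell's halving sequence, largest gap first. */
    int shell[32], ns = 0;
    for (int h = N / 2; h > 0; h /= 2) shell[ns++] = h;

    /* Ciura's gaps extended past 701 by factor 2.25 (a common heuristic,
       not part of the published sequence), then reversed to descending. */
    int asc[32] = {1, 4, 10, 23, 57, 132, 301, 701};
    int na = 8;
    while ((long)(asc[na - 1] * 2.25) < N) {
        asc[na] = (int)(asc[na - 1] * 2.25);
        na++;
    }
    int ciura[32];
    for (int i = 0; i < na; i++) ciura[i] = asc[na - 1 - i];

    printf("halving: %ld comparisons\n", sort_count(a, N, shell, ns));
    printf("ciura:   %ld comparisons\n", sort_count(b, N, ciura, na));
    return 0;
}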

Space Complexity and Stability

Shellsort operates as a fully in-place algorithm, requiring only O(1) auxiliary space beyond the input array. It uses a constant number of variables (loop indices and a temporary for the element being inserted), avoids recursion entirely, and allocates no extra arrays or intermediate structures, distinguishing it from algorithms like merge sort that demand O(n) additional space for merging. This in-place nature makes Shellsort particularly advantageous where memory is scarce, such as embedded systems or large datasets for which auxiliary storage is prohibitive.

Regarding stability, Shellsort is not stable: equal elements may change their relative order between input and output. Instability arises in the passes with gaps larger than 1. The insertion sort applied within a single subarray preserves the relative order of the equal elements it sees, but elements with equal keys generally land in different interleaved subarrays, and a long-distance move inside one subarray can carry an element past an equal-keyed element sitting in another. For example, with keys [2a, 2b, 1] and gap 2, the even-index subarray is [2a, 1]; the 2-sort moves 2a to the last position, past 2b, and the final gap-1 pass (stable on its own) has no reason to swap the equal keys back. This contrasts with basic insertion sort, which is inherently stable because it performs only adjacent swaps and never inverts equal elements. Shellsort deliberately sacrifices stability to correct distant inversions quickly, gaining efficiency over insertion sort on larger, less ordered inputs. When stability is required, variants can break ties by each element's original index, though this adds time and space overhead. A sketch of the instability follows.
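A minimal sketch of the instability, using the [2a, 2b, 1] example with the fixed gap list {2, 1}; the struct layout and names are illustrative:

#include <stdio.h>

struct rec { int key; char tag; };

/* Shellsort on records by key, with the fixed gap list {2, 1}. */
static void shellsort_rec(struct rec a[], int n) {
    const int gaps[] = {2, 1};
    for (int g = 0; g < 2; g++) {
        int h = gaps[g];
        for (int i = h; i < n; i++) {
            struct rec temp = a[i];
            int j = i;
            while (j >= h && a[j - h].key > temp.key) {
                a[j] = a[j - h];
                j -= h;
            }
            a[j] = temp;
        }
    }
}

int main(void) {
    /* Input order: 2a, 2b, 1. The 2-sort pairs indices {0,2}: it moves
       2a past 2b to make room for 1, so the equal keys end up reordered. */
    struct rec a[] = {{2, 'a'}, {2, 'b'}, {1, '-'}};
    shellsort_rec(a, 3);
    for (int i = 0; i < 3; i++) printf("%d%c ", a[i].key, a[i].tag);
    printf("\n");   /* prints: 1- 2b 2a  (input had 2a before 2b) */
    return 0;
}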

Applications and Comparisons

Practical Applications

Shellsort is particularly valued in embedded systems because it is in place, requiring no memory allocation beyond the input array, and avoids the recursive calls that can overflow small stacks. Its implementation simplicity further suits such platforms, handling moderately sized datasets without the overhead of more complex algorithms.

In legacy software, Shellsort was formerly used in the Linux kernel's generic sorting code, where a non-recursive sort is safer than recursive qsort implementations given the kernel's limited stack. Similarly, the bzip2 compression utility uses Shellsort when sorting block data, avoiding recursion-depth issues in constrained environments such as embedded file systems, and the uClibc C library for embedded systems continues to ship a Shellsort.

Shellsort is also incorporated into hybrid sorting strategies, for instance handing quicksort subarrays smaller than about 50 elements to Shellsort, which sidesteps quicksort's recursion-depth problems while exploiting Shellsort's efficiency on small, nearly sorted segments; a sketch of this pattern appears below. Switching to Shellsort at low thresholds reduces the overhead of quicksort's partitioning on tiny segments, and some hybrid designs use it for small subarrays in place of plain insertion sort. Its non-recursive structure helps wherever bounded stack usage is required, such as task scheduling in real-time operating systems, and it remains a common classroom example of how incremental gap reduction improves on basic insertion sort.

In practice, Shellsort performs well for arrays of roughly 100 to 10,000 elements, where empirical benchmarks indicate it completes in approximately 10-20% of the time required by bubble sort on random data, owing to its subquadratic behavior. It is nevertheless rarely the first choice in modern applications: optimized library sorts, such as the introsort behind C++'s std::sort or the dual-pivot quicksort and TimSort used by Java's libraries, generally outperform it on large datasets through hybrid techniques. Nonetheless, it remains valuable for understanding incremental algorithmic improvements and serves as a building block in custom sorting solutions for niche constraints.
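A sketch of the hybrid pattern described above; the cutoff of 50 follows the text, while the last-element pivot and the helper names are illustrative simplifications:

#include <stdio.h>

#define CUTOFF 50   /* hand small subarrays to Shellsort, per the text */

/* Shellsort on a[lo..hi] with halving gaps. */
static void shellsort_range(int a[], int lo, int hi) {
    int n = hi - lo + 1;
    for (int gap = n / 2; gap > 0; gap /= 2)
        for (int i = lo + gap; i <= hi; i++) {
            int temp = a[i], j = i;
            while (j >= lo + gap && a[j - gap] > temp) {
                a[j] = a[j - gap];
                j -= gap;
            }
            a[j] = temp;
        }
}

/* Quicksort with last-element pivot, switching to Shellsort below CUTOFF. */
static void hybrid_sort(int a[], int lo, int hi) {
    if (hi - lo + 1 <= CUTOFF) {
        shellsort_range(a, lo, hi);
        return;
    }
    int pivot = a[hi], i = lo;
    for (int j = lo; j < hi; j++)
        if (a[j] < pivot) { int t = a[i]; a[i] = a[j]; a[j] = t; i++; }
    int t = a[i]; a[i] = a[hi]; a[hi] = t;   /* place pivot */
    hybrid_sort(a, lo, i - 1);
    hybrid_sort(a, i + 1, hi);
}

int main(void) {
    int a[] = {9, 4, 7, 1, 8, 2, 6, 3, 5, 0};
    hybrid_sort(a, 0, 9);
    for (int i = 0; i < 10; i++) printf("%d ", a[i]);  /* 0 1 ... 9 */
    printf("\n");
    return 0;
}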

Comparisons to Other Sorting Algorithms

Shellsort generalizes insertion sort by applying it to interleaved subarrays defined by decreasing gap sizes; the early exchanges between distant elements reduce the total number of inversions and improve the worst-case time complexity from O(n²) to subquadratic bounds, such as O(n^{4/3}) for gap sequences like Sedgewick's. Compared to quicksort, Shellsort is fully in place and non-recursive, avoiding pivot-selection overhead and the O(n²) worst case of poor partitions, though its average-case performance is typically 2 to 5 times slower due to higher constant factors in comparisons and swaps. Empirical tests on random inputs show quicksort overtaking Shellsort variants for arrays larger than about 10,000 elements; for example, on 75,000 integers, quicksort took 20.8 seconds while a divisor-based Shellsort variant required 32.4 seconds on the same hardware (1987 benchmarks). Shellsort, however, offers more predictable runtimes without any need for randomization.
Array Size    Shellsort (Divisors)    Quicksort
1,000         1.5 s                   1.7 s
5,000         2.9 s                   2.6 s
10,000        4.6 s                   3.8 s
35,000        14.7 s                  10.5 s
75,000        32.4 s                  20.8 s
Table: Execution times (seconds) for sorting random integers on 1987 hardware, adapted from measurements of shaker-sort variants that approximate standard Shellsort performance.

Like merge sort, Shellsort achieves subquadratic average time, such as O(n^{4/3}) for suitable gap sequences, but it uses constant extra space versus merge sort's O(n) auxiliary array requirement, making it preferable in memory-constrained environments. Shellsort's strided, non-sequential access patterns, however, can yield poorer cache performance than merge sort's sequential merges, and it is not stable while merge sort preserves the relative order of equal elements. In practice, on datasets up to about 42,000 elements, Shellsort's execution times are comparable to or slightly faster than merge sort's in some empirical tests, but merge sort scales better for very large inputs.

Heapsort shares Shellsort's in-place, non-recursive, comparison-based character, and additionally guarantees an O(n log n) worst case that Shellsort's common gap sequences do not. Even so, Shellsort is often faster in practice on small to medium arrays (under roughly 10,000 elements) thanks to its adaptive behavior on partially sorted data and its simpler inner loop, free of heap-maintenance overhead. For much larger arrays, such as one million elements, both perform poorly in unoptimized implementations (reportedly over 2000 seconds in interpreted Python on modern hardware), while optimized versions and hybrids outperform them. Neither algorithm is stable, but Shellsort's gap-based passes make it more responsive to the input distribution.

Empirical studies indicate that Shellsort with good gap sequences performs roughly 3n log₂ n comparisons in practice, against about 1.4n log₂ n for quicksort, a higher constant factor that explains its relative slowness despite sub-O(n²) scaling. Shellsort is chosen for its simplicity and low overhead in embedded systems or special-purpose code where recursion or extra space is undesirable, whereas quicksort, merge sort, or heapsort are preferred for large-scale sorting needing optimal speed or stability. In modern languages like C++, the standard library's std::sort, an introsort (quicksort-heapsort hybrid), consistently outperforms pure Shellsort by factors of 2 to 10 on typical workloads, but Shellsort remains useful under custom constraints such as limited stack space or small buffers. A timing-harness sketch follows.
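A minimal timing-harness sketch against the C library's qsort; the instance size echoes the table above, the seed is arbitrary, and absolute times depend entirely on hardware and compiler:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

static void shellsort(int a[], int n) {
    for (int gap = n / 2; gap > 0; gap /= 2)
        for (int i = gap; i < n; i++) {
            int temp = a[i], j = i;
            while (j >= gap && a[j - gap] > temp) {
                a[j] = a[j - gap];
                j -= gap;
            }
            a[j] = temp;
        }
}

/* Three-way comparator for qsort, avoiding overflow from subtraction. */
static int cmp_int(const void *p, const void *q) {
    int x = *(const int *)p, y = *(const int *)q;
    return (x > y) - (x < y);
}

int main(void) {
    enum { N = 75000 };
    static int orig[N], work[N];
    srand(1987);
    for (int i = 0; i < N; i++) orig[i] = rand();

    memcpy(work, orig, sizeof orig);
    clock_t t0 = clock();
    shellsort(work, N);
    printf("shellsort: %.3f s\n", (double)(clock() - t0) / CLOCKS_PER_SEC);

    memcpy(work, orig, sizeof orig);
    t0 = clock();
    qsort(work, N, sizeof *work, cmp_int);
    printf("qsort:     %.3f s\n", (double)(clock() - t0) / CLOCKS_PER_SEC);
    return 0;
}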
