Quicksort

Quicksort is a comparison-based sorting algorithm that employs a divide-and-conquer strategy to efficiently arrange the elements of an array or list. Developed by British computer scientist C. A. R. Hoare in 1959 while working on a machine translation project in Moscow, it was first published in 1961 and has since become one of the most widely used sorting methods due to its balance of simplicity and performance. The core mechanism involves selecting a pivot, partitioning the data so that all elements smaller than the pivot precede it and all larger elements follow, then recursively applying the process to the resulting sub-arrays until the entire structure is sorted.

In terms of computational complexity, Quicksort exhibits an average-case time complexity of O(n log n), where n is the number of elements, making it highly efficient for large datasets; this arises from the expected balanced partitioning that halves the problem size logarithmically. However, its worst-case performance degrades to O(n²) when the pivot choice consistently results in highly unbalanced partitions, such as on already-sorted input without randomization. To address this, practical implementations often incorporate techniques like random pivot selection or median-of-three partitioning, which selects the pivot as the median of the first, middle, and last elements to improve balance and reduce the likelihood of pathological cases. These enhancements can yield running times as low as approximately 1.39 n log n comparisons on average for random inputs.

Quicksort operates in-place, requiring O(log n) additional space on average beyond the input due to the recursion stack, which contrasts with algorithms like mergesort that need additional memory proportional to n. It is not stable, meaning that the relative order of equal elements may change, but this trade-off contributes to its speed and efficiency on modern hardware. Variants such as 3-way quicksort further optimize handling of duplicate elements by partitioning into three groups (less than, equal to, and greater than the pivot), achieving entropy-optimal performance for inputs with many repeats. The algorithm's influence extends to its adoption in numerous standard library implementations, including the qsort function in the C standard library and hybrid quicksort variants in Java's Arrays.sort. Despite alternatives like heapsort offering guaranteed O(n log n) worst-case performance, Quicksort remains preferred in practice for its superior average-case speed, low constant factors, and adaptability to real-world data distributions. Its invention marked a significant advancement in algorithmic efficiency, earning Hoare the 1980 Turing Award partly for this contribution.

History

Origins and Development

Quicksort was invented by C. A. R. (Tony) Hoare in 1959 while serving as a visiting student at Moscow State University, where he was engaged in a machine translation project involving the sorting of Russian dictionary words on magnetic tape. The motivation stemmed from the inefficiencies of existing sorting algorithms like bubble sort and Shellsort, which exhibited quadratic time complexity, prompting Hoare to devise a more efficient in-place method suitable for the limited resources of contemporary hardware. Upon returning to England and joining Elliott Brothers, Hoare implemented and tested the algorithm on the Elliott 803 computer, aiming to outperform merge sort by minimizing auxiliary storage and optimizing for random-access memory.

Hoare conceived the core idea of partitioning an array around a pivot during a moment of reflection in his Moscow dormitory, sketching the partition procedure in Mercury Autocode. This recursive approach divided the problem into smaller subproblems, with the pivot serving as the basis for rearranging elements. The algorithm received its first formal description in a 1961 publication titled "Algorithm 64: Quicksort" in the Communications of the ACM, where Hoare presented it as a novel sorting technique for computer random-access stores, demonstrating superior speed and low memory usage compared to existing alternatives.

Early implementations, including Hoare's original, selected the first element as the pivot for simplicity, but this choice caused significant performance degradation on sorted or reverse-sorted inputs, as it produced unbalanced partitions and led to near-quadratic runtime. Additionally, the inherently recursive structure posed challenges on early computers with shallow call stacks and limited memory, risking stack overflow for deep recursion on large datasets. To address these issues, subsequent evolutions incorporated iterative variants and optimizations, such as tail recursion elimination, to better suit the hardware constraints of the era.

Key Publications and Contributors

The seminal publication on quicksort is C. A. R. Hoare's 1962 paper, which formally introduced the algorithm, provided a detailed description of its partitioning scheme, and proved its average-case time complexity of approximately 1.386 n log n comparisons. This work built on Hoare's earlier 1961 algorithmic specification in Communications of the ACM, establishing quicksort as a practical and efficient sorting method for computer random-access memory. Hoare's contributions to algorithms, including quicksort, were recognized with the 1980 ACM Turing Award, though the award highlighted his broader impact on programming languages and verification.

Early refinements to address worst-case performance issues, such as unbalanced partitions from poor pivot selection, were introduced by Richard Singleton in 1969, who proposed the median-of-three pivot strategy—selecting the pivot as the median of the first, middle, and last elements in the subarray—to improve average performance and reduce the likelihood of degenerate cases. This optimization has since become a standard feature in many quicksort implementations.

In the 1970s, Robert Sedgewick advanced the understanding and practical application of quicksort through extensive empirical analysis and implementation studies, including his 1975 Ph.D. thesis at Stanford University and a 1978 Communications of the ACM paper that evaluated variants like median-of-three partitioning and small-array optimizations, demonstrating significant performance gains on real hardware. Sedgewick's work emphasized hybrid approaches, influencing later developments such as introsort, though his direct contributions focused on analytical models and code tuning rather than the full hybrid algorithm proposed by David Musser in 1997. Quicksort's theoretical foundations were further solidified in Donald Knuth's 1973 volume of The Art of Computer Programming, which provided a rigorous probabilistic analysis and compared it to other sorting algorithms, cementing its status as a cornerstone of computer science.

Later practical advancements came from Jon Bentley and M. Douglas McIlroy in their 1993 paper "Engineering a Sort Function," which described an optimized quicksort variant for C libraries incorporating median-of-three pivots, three-way partitioning for duplicates, and small-subarray handling via insertion sort, achieving superior speed and robustness in production systems like BSD Unix. This implementation influenced standard library sorts in languages including C, C++, and Java.

Core Algorithm

Step-by-Step Procedure

Quicksort is a divide-and-conquer algorithm that operates by selecting a pivot from the array and partitioning the other elements into two subarrays around the pivot, with all elements less than or equal to the pivot placed before it and all greater than the pivot placed after it. The pivot is then in its final sorted position, and the two subarrays are recursively sorted. This process continues until the subarrays contain only one or zero elements, at which point they are considered sorted. The algorithm was originally described by C. A. R. Hoare in 1961 as a method for efficient sorting in computer random-access stores. The basic recursive procedure can be expressed in pseudocode as follows, where the array is sorted in-place between indices low and high:
function quicksort(array, low, high)
    if low < high
        pivot_index = partition(array, low, high)
        quicksort(array, low, pivot_index - 1)
        quicksort(array, pivot_index + 1, high)
Here, the partition function rearranges the subarray such that all elements less than or equal to the pivot are to its left and all greater are to its right, returning the pivot's final index; specific partitioning methods are discussed elsewhere. This structure assumes the reader is familiar with arrays as contiguous data structures and basic recursion, where each call handles a smaller subproblem until the base case (when low >= high) is reached, avoiding further subdivision. Hoare's original implementation in ALGOL follows this recursive template, emphasizing its simplicity for machine implementation. Quicksort performs sorting in-place, meaning it requires only a small amount of additional memory beyond the original array, as partitioning swaps elements within the array itself rather than creating new copies. The recursion depth, which is logarithmic on average but can reach linear in the worst case, directly influences both the running time (through the number of partitioning steps) and the auxiliary space usage for the call stack, making balanced partitions crucial for efficiency. Hoare's 1962 analysis shows that this leads to an expected O(n log n) comparisons for n elements, underscoring the performance implications of recursion depth.

Partitioning Process

The partitioning process is the core mechanism of Quicksort that rearranges elements in a subarray around a chosen pivot, placing the pivot in its correct sorted position such that all elements to the left are less than or equal to it and all elements to the right are greater than it. This division enables the recursive sorting of the two resulting subarrays, exploiting the divide-and-conquer paradigm to achieve efficiency. The process begins with selecting a pivot from the subarray, which serves as the reference value for partitioning; common choices include the first, last, or middle element, though the exact method can vary without altering the fundamental mechanics. The partitioning is performed using swaps to rearrange the elements relative to the pivot, ensuring the required ordering invariant. Specific partitioning schemes, such as the Hoare or Lomuto methods, differ in their pointer usage and swap logic but achieve the same goal; these are detailed in the Partition Schemes section. This operation is performed in-place, requiring only constant additional space beyond the original array, as swaps modify the array directly without auxiliary storage. The partitioning step examines each element in the subarray a constant number of times, ensuring a time complexity of O(n), where n is the size of the subarray, providing a balanced foundation for the algorithm's overall performance.

Partition Schemes

Lomuto Scheme

The Lomuto partition scheme is a partitioning method used in quicksort that employs a single scanning index to separate elements smaller than the pivot from those larger or equal, while fixing the pivot at the end of the subarray. Attributed to Nico Lomuto of Alsys, Inc., and first described in detail by Jon Bentley in his Communications of the ACM column, this scheme simplifies implementation by avoiding the dual-pointer approach of earlier methods. In this scheme, the pivot is selected as the last element of the subarray from low to high. An index i is initialized to low - 1 to mark the boundary of elements known to be smaller than the pivot. A scanning index j then iterates from low to high - 1. For each element at j, if it is smaller than the pivot, i is incremented, and the elements at i and j are swapped to place the smaller element in the correct region. After the scan completes, the pivot is swapped with the element at i + 1, positioning it correctly, and i + 1 is returned as the partition index. This process ensures all elements before the returned index are smaller than the pivot, and all after are greater or equal. The following C code illustrates the Lomuto partition function for an array arr of integers:
/* Swap two array elements through pointers. */
void swap(int *a, int *b) {
    int tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Lomuto partition: uses arr[high] as the pivot and returns its final index. */
int partition(int arr[], int low, int high) {
    int pivot = arr[high];          /* pivot fixed at the end of the subarray */
    int i = low - 1;                /* boundary of the "smaller than pivot" region */
    for (int j = low; j < high; j++) {
        if (arr[j] < pivot) {       /* grow the smaller-than region */
            i++;
            swap(&arr[i], &arr[j]);
        }
    }
    swap(&arr[i + 1], &arr[high]);  /* place the pivot in its sorted position */
    return i + 1;
}
This implementation defines a small swap helper that exchanges two array elements through pointers, keeping the function self-contained. The Lomuto scheme offers advantages in code simplicity and readability, making it particularly suitable for educational purposes and beginners, as it requires only one loop and fewer variables than bidirectional schemes. However, it is less efficient in terms of data movement: on random permutations it performs roughly three times as many swaps as Hoare's scheme, since every element smaller than the pivot triggers a swap even when it is already in an acceptable position.
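For context, a minimal sketch of a driver follows, showing how this partition function plugs into the recursive quicksort template given earlier; the main function and test array are illustrative only.

#include <stdio.h>

/* Assumes the swap and partition functions defined above. */
void quicksort(int arr[], int low, int high) {
    if (low < high) {
        int p = partition(arr, low, high);  /* pivot lands at index p */
        quicksort(arr, low, p - 1);         /* sort elements left of the pivot */
        quicksort(arr, p + 1, high);        /* sort elements right of the pivot */
    }
}

int main(void) {
    int a[] = {9, 4, 7, 1, 8, 2};
    int n = sizeof a / sizeof a[0];
    quicksort(a, 0, n - 1);
    for (int i = 0; i < n; i++)
        printf("%d ", a[i]);                /* prints: 1 2 4 7 8 9 */
    printf("\n");
    return 0;
}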

Hoare Scheme

The Hoare partition scheme, developed by C. A. R. Hoare as part of the original quicksort algorithm, utilizes two indices that traverse the subarray from opposite ends toward the center to rearrange elements around a chosen pivot. Introduced in 1961, this method selects an arbitrary element—often the first in the subarray—as the pivot and aims to partition the array such that all elements to the left of the final partition point are less than or equal to the pivot, while those to the right are greater than or equal to it, though the pivot itself may not end up in its sorted position immediately after partitioning. The algorithm initializes the left index i to one position before the start of the subarray and the right index j to one position after the end. It then increments i until it finds an element greater than or equal to the pivot and decrements j until it finds an element less than or equal to the pivot. If the indices have not crossed (i < j), the elements at i and j are swapped, and the process repeats. Once i >= j, the function returns j as the partition index, without an additional swap for the pivot, allowing the recursive quicksort calls to handle the two adjoining ranges that together cover the pivot's location. This design ensures a single pass through the subarray while minimizing unnecessary operations. Here is the C code for the Hoare partition scheme:
/* Hoare partition: uses arr[low] as the pivot; returns an index j such that
   every element in arr[low..j] is <= every element in arr[j+1..high]. */
int partition(int arr[], int low, int high) {
    int pivot = arr[low];
    int i = low - 1;
    int j = high + 1;
    while (1) {
        do {
            i++;                 /* scan right for an element >= pivot */
        } while (arr[i] < pivot);
        do {
            j--;                 /* scan left for an element <= pivot */
        } while (arr[j] > pivot);
        if (i >= j) {
            return j;            /* pointers crossed: j is the split point */
        }
        swap(&arr[i], &arr[j]);  /* reuses the swap helper defined above */
    }
}
The scheme's efficiency stems from its dual-pointer approach, which performs approximately three times fewer swaps than the Lomuto partition on average for random inputs—around n/6 swaps—reducing the overall computational overhead. However, because the pivot is not fixed in its final position, the recursive calls must cover both full ranges around the returned index j (low..j and j+1..high) rather than excluding a settled pivot position, which slightly complicates the recursion compared to schemes that fix the pivot in place during partitioning.
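Because the returned index is a split point rather than a final pivot position, the recursive template differs slightly from the Lomuto version. A minimal sketch of the matching recursion follows; the j and j + 1 bounds are the essential detail.

/* Recursion for the Hoare scheme: note low..j and j+1..high,
   not low..p-1 and p+1..high as with Lomuto. */
void quicksort_hoare(int arr[], int low, int high) {
    if (low < high) {
        int j = partition(arr, low, high);  /* Hoare partition above */
        quicksort_hoare(arr, low, j);
        quicksort_hoare(arr, j + 1, high);
    }
}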

Implementation Aspects

Pivot Selection Strategies

The choice of pivot in quicksort is crucial for achieving balanced partitions and avoiding degenerate cases that lead to quadratic time complexity. Simple strategies, such as selecting the first, last, or middle element of the subarray, are easy to implement but perform poorly on ordered inputs, where they produce unbalanced partitions and recursion depths approaching O(n). To address these limitations, the median-of-three method selects the pivot as the median value among the first, middle, and last elements of the subarray. This provides a closer approximation to the true median, reducing the probability of imbalances and improving average-case performance by about 5-15% in comparisons compared to fixed-position selection. The approach was systematically analyzed and advocated by Robert Sedgewick in his empirical studies of quicksort variants. Randomized pivot selection mitigates adversarial inputs by choosing a uniformly random element from the subarray as the pivot. This variation ensures an expected time complexity of O(n log n) for any input, with the expected number of comparisons bounded by approximately 1.386 n log₂ n. For larger arrays, the ninther strategy refines median-of-three by sampling nine elements from the subarray in three groups of three, computing the median of each group, and then selecting the median of those three group medians as the pivot. This method yields an even better approximation of the array's median, further decreasing the chance of poor partitions in practice. Bentley and McIlroy developed and empirically validated the ninther in their engineering of an optimized quicksort for C libraries, where it contributed to robust performance across diverse inputs. Overall, effective pivot selection balances computational overhead with quality; suboptimal choices can produce O(n) recursion depths and O(n²) running times by consistently favoring one-sided splits, underscoring the need for heuristics tailored to input characteristics.
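As a concrete illustration of median-of-three selection, the sketch below picks the median of the first, middle, and last elements and parks it at the end of the subarray so the Lomuto partition above can use it as the pivot; the move-to-end convention is a common but not universal choice.

/* Median-of-three: order arr[low], arr[mid], arr[high], then move the
   median to arr[high] so the Lomuto partition can use it as the pivot. */
void median_of_three(int arr[], int low, int high) {
    int mid = low + (high - low) / 2;
    if (arr[mid] < arr[low])  swap(&arr[mid], &arr[low]);
    if (arr[high] < arr[low]) swap(&arr[high], &arr[low]);
    if (arr[high] < arr[mid]) swap(&arr[high], &arr[mid]);
    /* arr[mid] now holds the median of the three samples */
    swap(&arr[mid], &arr[high]);
}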

Handling Repeated Elements

In standard quicksort implementations using two-way partitioning, such as the Lomuto or Hoare schemes, a high frequency of repeated elements can lead to unbalanced partitions. If the pivot value appears many times, all duplicates may be placed on one side of the pivot, resulting in highly uneven subarray sizes that degrade performance toward the worst-case O(n²) behavior, even for non-adversarial inputs. To mitigate this issue, 3-way partitioning techniques group elements into three categories relative to the pivot: those strictly less than the pivot, those equal to the pivot, and those strictly greater than the pivot. This approach, equivalent to solving the Dutch national flag problem, was refined and advocated by Bentley and McIlroy in their engineering-focused analysis of sorting algorithms. The method employs three pointers to maintain invariants during the scan: one for the boundary of the less-than region, one for the current scanning position, and one for the boundary of the greater-than region. Elements are swapped as needed to expand the less-than and greater-than segments while skipping or centralizing equals, ensuring duplicates are collected contiguously without unnecessary swaps. A representative pseudocode implementation of the 3-way partitioning step, based on a Dutch National Flag variant adapted for quicksort (commonly used in practice, e.g., by Sedgewick), operates on an array a from indices low to high with pivot value v (typically chosen via a heuristic like median-of-three):
lt = low
gt = high
i = low

while i <= gt:
    if less(a[i], v):
        swap(a, lt, i)
        lt += 1
        i += 1
    elif greater(a[i], v):
        swap(a, i, gt)
        gt -= 1
    else:  # a[i] == v
        i += 1
After partitioning, the elements from low to lt-1 are less than v, from lt to gt are equal to v, and from gt+1 to high are greater than v. Recursive calls are made only on the less-than and greater-than subarrays (sort(a, low, lt-1) and sort(a, gt+1, high)), skipping the equal segment. This technique provides significant benefits in datasets with many duplicates, achieving linear O(n) total sorting time when all elements equal the pivot, since a single partitioning pass places the entire array in the middle segment and no further subdivision is required. It is incorporated in Java's dual-pivot quicksort implementation, where the three-part partitioning around the two pivots similarly groups duplicates in the central interval to maintain efficiency.

Common Optimizations

One common optimization for quicksort implementations involves eliminating tail recursion to mitigate stack overflow risks and reduce space overhead in the worst case. In the standard recursive quicksort, the algorithm makes two recursive calls after partitioning: one for the subarray of elements less than the pivot and another for elements greater than the pivot. By restructuring the code to make the recursive call on the smaller subarray and then handling the larger subarray in an iterative loop (the eliminated tail call), only one recursive call remains at each step, allowing compilers or manual optimization to reuse the current stack frame instead of allocating a new one. Since each remaining recursive call operates on at most half the elements, this transforms the worst-case stack depth from O(n) to O(log n), making the algorithm suitable for large inputs without excessive memory use.

Another widely adopted enhancement is hybridizing quicksort with insertion sort for small subarrays, which addresses the higher constant factors in quicksort's partitioning for tiny partitions. When the subarray size falls below a threshold—typically between 5 and 15 elements, with values around 10 being common in practice—the recursion switches to insertion sort, which performs better on small, nearly sorted data due to fewer comparisons and simpler operations, despite its O(n²) asymptotic complexity. This cutoff improves overall runtime by exploiting insertion sort's low overhead on small inputs while retaining quicksort's advantages for larger ones, often yielding 10-20% speedups in empirical tests on random data.

Introsort, introduced by David Musser in 1997, combines quicksort with heapsort as a fallback to guarantee O(n log n) worst-case time while preserving quicksort's average-case speed. The algorithm monitors recursion depth and switches to heapsort if it exceeds approximately 2 log₂ n levels, preventing the O(n²) degradation from poor pivot choices like sorted inputs. This hybrid approach is now standard in libraries such as the C++ Standard Template Library's std::sort, where it balances introspection of partitioning behavior with guaranteed bounds, achieving near-optimal performance across diverse inputs without sacrificing in-place properties.

Cache-friendly modifications, such as those in BlockQuicksort by Edelkamp and Weiß (2016), further enhance practical efficiency by improving memory locality and reducing branch mispredictions. BlockQuicksort partitions elements in blocks of fixed size (e.g., cache line length) to align data accesses with hardware cache lines, minimizing cache misses during swaps and comparisons. Experimental evaluations show it sorts random integers up to 80% faster than standard quicksort on modern processors, primarily due to better temporal and spatial locality, while maintaining the same asymptotic complexity.
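The first two optimizations combine naturally. Below is a sketch of a quicksort loop that recurses only on the smaller partition, iterates on the larger one, and falls back to insertion sort below a threshold; the threshold of 10 and the reuse of the Lomuto partition from above are illustrative choices, not a definitive implementation.

#define CUTOFF 10  /* illustrative small-subarray threshold */

/* Insertion sort for small subarrays. */
void insertion_sort(int arr[], int low, int high) {
    for (int i = low + 1; i <= high; i++) {
        int key = arr[i], j = i - 1;
        while (j >= low && arr[j] > key) {
            arr[j + 1] = arr[j];
            j--;
        }
        arr[j + 1] = key;
    }
}

/* Hybrid quicksort: recurse on the smaller side, loop on the larger side,
   so the stack depth stays O(log n); tiny subarrays go to insertion sort. */
void quicksort_opt(int arr[], int low, int high) {
    while (high - low + 1 > CUTOFF) {
        int p = partition(arr, low, high);   /* e.g., the Lomuto partition above */
        if (p - low < high - p) {            /* left side is smaller: recurse on it */
            quicksort_opt(arr, low, p - 1);
            low = p + 1;                     /* continue loop on the larger right side */
        } else {                             /* right side is smaller: recurse on it */
            quicksort_opt(arr, p + 1, high);
            high = p - 1;                    /* continue loop on the larger left side */
        }
    }
    insertion_sort(arr, low, high);          /* finish the remaining small range */
}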

Complexity Analysis

Worst-Case Behavior

The worst-case time complexity of quicksort is \Theta(n^2), which arises when the partitioning step consistently produces highly unbalanced subarrays, such as one subarray containing all but one element and the other being empty. This scenario is particularly evident in implementations that always select the first or last element as the pivot. A classic example occurs with inputs that are already sorted in ascending or descending order. In an ascending sorted array with the first element as pivot, each partition places the pivot at the beginning of the subarray, resulting in a left subarray of size 0 and a right subarray of size n-1. Similarly, a descending sorted array with the last element as pivot leads to analogous imbalance. Adversarial inputs, crafted to exploit deterministic pivot choices, can also force this degenerate partitioning repeatedly. Under these conditions, the recurrence relation for the time complexity T(n) simplifies to T(n) = T(n-1) + \Theta(n), since the partitioning step takes linear time and one recursive call dominates. Solving this recurrence yields T(n) = \Theta(n^2), specifically approximately n^2/2 comparisons in the unbalanced case. To avoid this behavior, pivot selection can be improved: randomizing the pivot yields an expected O(n \log n) runtime on any input, while the deterministic median-of-medians algorithm approximates the median in linear time and guarantees O(n \log n) even in the worst case.
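Expanding the recurrence step by step makes the quadratic bound explicit, using a generic constant c > 0 for the linear partitioning cost at each level:
T(n) = T(n-1) + cn = T(n-2) + c(n-1) + cn = \cdots = T(1) + c \sum_{k=2}^{n} k = \Theta(1) + c\left(\frac{n(n+1)}{2} - 1\right) = \Theta(n^2).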

Average-Case Analysis

The average-case analysis of Quicksort relies on the assumption that the input array is a uniform random permutation of distinct elements, or equivalently, that the pivot is chosen uniformly at random from the current subarray at each step. This probabilistic model ensures that the position of the pivot after partitioning is equally likely to be any index from 1 to n, leading to balanced expected subproblem sizes. Under this assumption, the expected running time T(n) for an array of size n satisfies the recurrence
T(n) = \frac{1}{n} \sum_{q=1}^{n} \left[ T(q-1) + T(n-q) \right] + \Theta(n),
with base cases T(0) = T(1) = \Theta(1). The \Theta(n) term accounts for the linear-time partitioning step. Solving this recurrence via standard techniques, such as expanding it into a summation over subproblem sizes or verifying the O(n \log n) guess by the substitution method (the master theorem does not apply directly to a full-history recurrence of this form), yields T(n) = O(n \log n). This establishes that Quicksort's average performance matches the information-theoretic lower bound for comparison-based sorting.
A more refined analysis targets the expected number of key comparisons, which dominates the running time. One approach solves the recurrence directly for the comparison count C(n):
C(n) = \frac{1}{n} \sum_{q=1}^{n} \left[ C(q-1) + C(n-q) \right] + (n-1),
with C(0) = C(1) = 0. The exact solution is C(n) = 2(n+1) H_n - 4n, where H_n = \sum_{k=1}^n 1/k is the nth harmonic number. Asymptotically, H_n \approx \ln n + \gamma (with \gamma \approx 0.57721 the Euler–Mascheroni constant), so C(n) \approx 2n \ln n. In base-2 logarithms common to computer science, this equates to approximately 1.386 n \log_2 n comparisons, since 2 \ln n = 2 (\ln 2) \log_2 n and 2 \ln 2 \approx 1.386.
An alternative method uses indicator random variables to count comparisons via linearity of expectation. For each pair of distinct elements (i,j) with i < j, define an indicator I_{i,j} = 1 if i and j are compared directly (i.e., one is chosen as pivot before any element between them). The probability P(I_{i,j}=1) = 2/(j-i+1), as the first pivot in their range is equally likely to be any of the j-i+1 elements, and only the endpoints cause a comparison between i and j. The expected total comparisons is then \sum_{i<j} 2/(j-i+1) = 2 \sum_{k=2}^n (n-k+1)/k = 2(n+1) H_n - 4n, matching the recurrence solution. This probabilistic technique highlights the uniformity assumption's role in bounding expectations without recursion.
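The exact formula can be checked empirically. The sketch below, a self-contained experiment rather than a reference implementation, runs quicksort with a uniformly random pivot on shuffled distinct keys, counts comparisons, and prints the average next to 2(n+1)Hₙ - 4n; the trial count, n, and the helper names (rpartition, swap_l) are arbitrary choices.

#include <stdio.h>
#include <stdlib.h>

static long comparisons;  /* global comparison counter for the experiment */

static void swap_l(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Lomuto partition with a uniformly random pivot, counting comparisons. */
static int rpartition(int a[], int lo, int hi) {
    swap_l(&a[lo + rand() % (hi - lo + 1)], &a[hi]);  /* random pivot to the end */
    int pivot = a[hi], i = lo - 1;
    for (int j = lo; j < hi; j++) {
        comparisons++;                    /* exactly n-1 comparisons per partition */
        if (a[j] < pivot) swap_l(&a[++i], &a[j]);
    }
    swap_l(&a[i + 1], &a[hi]);
    return i + 1;
}

static void rquicksort(int a[], int lo, int hi) {
    if (lo < hi) {
        int p = rpartition(a, lo, hi);
        rquicksort(a, lo, p - 1);
        rquicksort(a, p + 1, hi);
    }
}

int main(void) {
    const int n = 1000, trials = 200;
    int a[1000];
    double total = 0.0, Hn = 0.0;
    for (int k = 1; k <= n; k++) Hn += 1.0 / k;      /* harmonic number H_n */
    for (int t = 0; t < trials; t++) {
        for (int k = 0; k < n; k++) a[k] = k;        /* distinct keys */
        for (int k = n - 1; k > 0; k--)              /* Fisher-Yates shuffle */
            swap_l(&a[k], &a[rand() % (k + 1)]);
        comparisons = 0;
        rquicksort(a, 0, n - 1);
        total += comparisons;
    }
    printf("average comparisons: %.1f\n", total / trials);
    printf("2(n+1)H_n - 4n    : %.1f\n", 2.0 * (n + 1) * Hn - 4.0 * n);
    return 0;
}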

Space Requirements

Quicksort is an in-place sorting algorithm, requiring only O(1) auxiliary space for the partitioning process itself, as it rearranges elements within the original array without needing additional storage proportional to the input size. The primary source of additional memory usage in the standard recursive implementation arises from the recursion stack, which stores the call frames for each recursive invocation. In the average case, with balanced partitions, the recursion depth is O(log n), leading to O(log n) space complexity for the stack. However, in the worst case, unbalanced partitions—such as when the pivot is consistently the smallest or largest element—result in a recursion depth of O(n), yielding O(n) space complexity. Iterative versions of quicksort replace recursion with an explicit stack or queue to manage subarray partitions, potentially reducing space requirements. While standard iterative implementations still require O(log n) space on average for the explicit stack, more sophisticated variants can achieve O(1) extra space overall by carefully managing partition boundaries without storing subproblem ranges, though these are significantly more complex to implement. Hybrid approaches like introsort, which combine quicksort with heapsort for worst-case guarantees, maintain similar space characteristics to standard quicksort, with O(log n) recursion stack space in practice due to enforced depth limits, but may incur a slight increase from the heapsort fallback's in-place operations.
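A sketch of an explicit-stack variant follows, assuming the partition function defined earlier: by always deferring the larger subrange to the stack and processing the smaller one immediately, the stack never holds more than O(log n) ranges, so a small fixed-size stack suffices for any 32-bit array length.

typedef struct { int lo, hi; } Range;

/* Iterative quicksort with an explicit range stack. Pushing the larger
   subrange and continuing on the smaller keeps the stack depth O(log n). */
void quicksort_iter(int arr[], int n) {
    Range stack[64];        /* 64 entries comfortably cover log2(n) for 32-bit n */
    int top = 0;
    if (n < 2) return;
    stack[top].lo = 0; stack[top].hi = n - 1; top++;
    while (top > 0) {
        Range r = stack[--top];
        while (r.lo < r.hi) {
            int p = partition(arr, r.lo, r.hi);  /* e.g., the Lomuto partition above */
            if (p - r.lo < r.hi - p) {
                stack[top].lo = p + 1; stack[top].hi = r.hi; top++;  /* defer larger right side */
                r.hi = p - 1;                                        /* continue on smaller left side */
            } else {
                stack[top].lo = r.lo; stack[top].hi = p - 1; top++;  /* defer larger left side */
                r.lo = p + 1;                                        /* continue on smaller right side */
            }
        }
    }
}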

Variants and Extensions

Multi-Pivot Approaches

Multi-pivot quicksort variants extend the classical algorithm by selecting multiple pivots to partition the array into more than two segments, which can lead to more balanced recursions and improved average-case performance on random data. These approaches aim to reduce the expected recursion depth while maintaining the overall O(n log n) time complexity, though they often incur higher constant factors due to additional comparisons during partitioning. The dual-pivot quicksort, introduced by Vladimir Yaroslavskiy in 2009, represents a key advancement in this category. It selects two pivots—initially the first and last elements of the subarray, later refined to positions near the middle (e.g., at indices left + (right - left) / 3 and right - (right - left) / 3)—and partitions the array into three intervals: elements less than the smaller pivot, elements between the two pivots, and elements greater than the larger pivot. This three-way partitioning is performed using three pointers that traverse the subarray, swapping elements to maintain the invariants while minimizing swaps compared to single-pivot methods. Yaroslavskiy's algorithm incorporates optimizations such as switching to insertion sort for small subarrays (typically size less than 27) to reduce overhead. The partitioning pseudocode is as follows:
function dualPivotPartition(A, left, right):
    p = A[left]
    q = A[right]
    if p > q:
        swap A[left], A[right]
        p = A[left]
        q = A[right]
    ℓ = left + 1
    g = right - 1
    k = ℓ
    while k ≤ g:
        if A[k] < p:
            swap A[ℓ], A[k]
            ℓ += 1
            k += 1
        elif A[k] > q:
            while A[g] > q and k < g:
                g -= 1
            swap A[k], A[g]
            g -= 1
            if A[k] < p:
                swap A[ℓ], A[k]
                ℓ += 1
            k += 1
        else:
            k += 1
    swap A[left], A[ℓ - 1]
    swap A[right], A[g + 1]
    return (ℓ - 1, g + 1)
The recursive calls then sort the three resulting subarrays independently. This variant was selected as the default sorting method in Oracle's Java 7 runtime library and subsequent versions, replacing the previous single-pivot implementation due to its empirical advantages. On random inputs of 2,000,000 integers, Yaroslavskiy's dual-pivot quicksort completed in 16.5 seconds, compared to 18.9 seconds for a classical single-pivot quicksort and 20.3 seconds for the JDK 6.0 implementation, demonstrating up to 25% speedup in favorable cases through fewer swaps (approximately 0.8 n ln n versus n ln n for single-pivot) while keeping comparisons at about 2 n ln n. Theoretical analysis confirms an expected O(n log n) running time with improved constants, particularly for uniformly distributed data, as the dual pivots reduce the variance in partition sizes.

Extensions to triple-pivot quicksort build on this idea by using three pivots—selected at the first quartile, median, and third quartile positions—to divide the array into four segments: less than the smallest pivot, between the smallest and middle pivot, between the middle and largest pivot, and greater than the largest pivot. Partitioning employs four pointers that advance toward the center, swapping elements based on comparisons to the ordered pivots (p < q < r), which enhances balance but requires more pointer management and comparisons. Experimental evaluations show triple-pivot variants achieving 7-8% faster runtime than dual-pivot on random data, attributed to 25% fewer cache misses and slightly lower expected comparisons (approximately 1.846 n ln n versus 1.9 n ln n), though they involve more swaps (0.615 n ln n) and higher implementation constants that can diminish gains on small arrays or specific architectures. Like dual-pivot, the expected time complexity remains O(n log n), with benefits most pronounced for distributions where single-pivot methods suffer from imbalance.

External and Specialized Variants

External quicksort adapts the algorithm for sorting data that exceeds available main memory, relying on external storage to handle large files. The algorithm reads blocks of records into a buffer, selects a pivot from the buffered elements, and partitions the data into separate output files based on comparisons with the pivot, minimizing disk I/O operations by processing records in chunks rather than individually. This block-based partitioning approach reduces the number of disk seeks compared to naive implementations, achieving performance competitive with external mergesort for certain workloads, though it may require more passes over the data in unbalanced cases. Verkamo's formulation demonstrates that external quicksort performs well with faster disks, showing less sensitivity to buffer and file sizes than merge-based alternatives.

Three-way radix quicksort combines quicksort's partitioning with radix sort's digit-by-digit processing, making it particularly effective for sorting strings or multi-digit integers. It selects a pivot and performs a three-way partition on the key character at a specific position (starting from the most significant digit or character), dividing elements into those less than, equal to, and greater than the pivot's character, then recurses on the less-than and greater-than partitions while advancing the character position for equal elements. This handles repeated prefixes efficiently by grouping equal substrings together, avoiding redundant comparisons in subsequent recursion levels. The algorithm achieves linearithmic time on average for fixed-length keys and outperforms standard quicksort by up to four times on string data due to reduced comparison costs and better cache utilization.

BlockQuicksort is a cache-optimized variant designed to mitigate branch mispredictions in quicksort's partitioning loop, which can degrade performance on processors with deep pipelines. Instead of comparing individual elements, it processes small blocks of elements (typically 8-16 items) as units, using masked comparisons and conditional swaps to maintain invariants without frequent branches. This approach improves branch predictability and data locality, reducing misprediction penalties while preserving quicksort's in-place nature and average O(n log n) complexity. Experimental evaluations show BlockQuicksort outperforming standard quicksort implementations like std::sort by up to 80% on random integer data, with gains more pronounced on larger arrays.

Parallel quicksort extends the algorithm to multi-core systems by distributing workload across threads, often using sample sort for balanced partitioning or tree-based recursion for hierarchical division. In sample sort, a small random sample of elements is drawn to select multiple pivots (typically one fewer than the number of threads), which partition the entire input into roughly equal-sized buckets distributed to threads for independent recursive sorting, ensuring load balance and minimizing communication overhead. Tree-based variants build a recursion tree where partitions are assigned to idle threads dynamically, leveraging work-stealing for load balancing on shared-memory machines. These methods achieve near-linear speedup on up to hundreds of cores, with sample sort demonstrating robustness across input distributions in parallel environments.
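To make the character-by-character recursion concrete, here is a sketch of three-way radix quicksort over NUL-terminated C strings in the Bentley–Sedgewick style; the function and helper names are illustrative, and signed-char comparison is assumed adequate for plain ASCII keys.

/* Three-way radix quicksort: partition a[lo..hi] on the character at
   position d, then recurse, advancing d only in the "equal" segment. */
static void swap_str(char *a[], int i, int j) {
    char *t = a[i]; a[i] = a[j]; a[j] = t;
}

void radix_quicksort(char *a[], int lo, int hi, int d) {
    if (lo >= hi) return;
    char v = a[lo][d];                      /* pivot character at position d */
    int lt = lo, gt = hi, i = lo + 1;
    while (i <= gt) {                       /* Dutch-flag scan on character d */
        if (a[i][d] < v)       swap_str(a, lt++, i++);
        else if (a[i][d] > v)  swap_str(a, i, gt--);
        else                   i++;
    }
    radix_quicksort(a, lo, lt - 1, d);      /* strings whose d-th char is smaller */
    if (v != '\0')
        radix_quicksort(a, lt, gt, d + 1);  /* equal char: compare next position */
    radix_quicksort(a, gt + 1, hi, d);      /* strings whose d-th char is larger */
}

Calling radix_quicksort(words, 0, n - 1, 0) sorts an array of n strings; advancing d only when the pivot character is not the terminating NUL is what lets shared prefixes be compared once rather than repeatedly.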

Comparisons and Relations

Quicksort shares conceptual similarities with selection sort in its use of partitioning to reorganize elements around a chosen value, though selection sort repeatedly identifies and swaps the minimum element into position without recursing on both subarrays, leading to a consistent O(n²) time complexity, whereas quicksort achieves O(n log n) on average by dividing and conquering both partitions. In comparison to mergesort, both algorithms employ a divide-and-conquer strategy, but quicksort performs partitioning in-place without requiring additional space for merging, offering lower memory overhead and practical speed advantages due to reduced per-comparison costs, while mergesort guarantees stability and consistent O(n log n) performance at the expense of extra storage. Introsort represents a hybrid extension of quicksort, integrating it with heapsort to ensure worst-case O(n log n) performance; it monitors recursion depth and switches to heapsort if the limit is exceeded, mitigating quicksort's potential O(n²) degradation while retaining its average-case efficiency. Quickselect, an adaptation of quicksort's partitioning mechanism, focuses on finding the k-th smallest element in an unordered array with O(n) average time complexity, recursing only on the relevant subarray containing the target rather than both sides as in full sorting.

Generalizations and Hybrids

Quickselect is a specialization of quicksort that finds the k-th smallest element in an unsorted array, known as the k-th order statistic, without fully sorting the data. Developed by Hoare as Algorithm 65 (Find), it employs the same partitioning mechanism as quicksort: select a pivot, partition the array into elements less than, equal to, and greater than the pivot, then recurse only on the subarray containing the k-th element. This approach achieves an average time complexity of O(n), making it efficient for selection problems where full sorting is unnecessary.

Hybrids of quicksort with other paradigms extend its partitioning to specialized domains, such as string sorting. Multi-key quicksort, a fusion with radix sort principles, handles multi-dimensional keys or strings by comparing keys character-by-character during partitioning, using three-way splits to manage equal elements efficiently. Originally proposed by Bentley and Sedgewick, this three-way radix quicksort variant partitions strings into less-than, equal, and greater-than groups based on successive key positions, reducing comparisons for common prefixes and achieving high performance on text data. Three-way radix quicksort further refines this for strings with many duplicates, as detailed by Sedgewick, by integrating radix-like digit examination into quicksort's recursive framework.

Partial quicksort addresses the need to sort only a prefix of the array, such as the l smallest elements in a list of n items, by combining quickselect's find operation with quicksort on the relevant subarray. Introduced by Conrado Martínez, it partitions the array and recurses only on the portion needed for the smallest l elements, yielding an expected time of O(n + l log l). This is particularly useful for priority queues, where maintaining the top k elements—such as the k largest items—can leverage quicksort's partitioning to avoid full sorts, as in selecting the k-th largest via a single partition and scanning the tail.

Incremental quicksort provides an online variant for dynamically updating sorted outputs as new elements arrive, suitable for data streams. Paredes and Navarro's optimal incremental sorting algorithm extends quicksort by incrementally partitioning and emitting the next smallest element, achieving O(m + k log k) time to output the first k elements from a set of m arrivals. This maintains balanced partitions dynamically, allowing efficient insertion and re-partitioning for continuous streams without restarting the sort.
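A sketch of quickselect follows, reusing the Lomuto partition defined earlier; the zero-based k convention (k = 0 selects the minimum) and the iterative form are illustrative choices.

/* Quickselect: return the k-th smallest element (0-based) of arr[lo..hi].
   Reuses the Lomuto partition defined earlier; average time O(n). */
int quickselect(int arr[], int lo, int hi, int k) {
    while (lo < hi) {
        int p = partition(arr, lo, hi);  /* pivot lands at its sorted index p */
        if (k == p)      return arr[p];
        else if (k < p)  hi = p - 1;     /* target lies left of the pivot */
        else             lo = p + 1;     /* target lies right of the pivot */
    }
    return arr[lo];                      /* one element left: the answer */
}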
