Counting sort
Counting sort is a non-comparison-based sorting algorithm designed for sorting a collection of integers within a small, known range, achieving linear time complexity by counting the occurrences of each possible value and using these counts to place elements in their correct positions in the output array.[1] The algorithm assumes the input consists of n elements drawn from the range [0, k], where k is the maximum key value, and it operates by first initializing a count array of size k + 1 to zero, then incrementing the count for each input element, computing cumulative sums to determine positions, and finally building the sorted array by placing elements in reverse order to maintain stability.[2] This process ensures the relative order of equal elements is preserved, making it a stable sort.[1]
Invented by Harold H. Seward in 1954 as part of early work on digital computer applications for business operations at MIT, counting sort laid foundational techniques for efficient integer sorting and was integrated into radix sort implementations to handle multi-digit keys.[3] Its time complexity of Θ(n + k) makes it optimal when k is O(n), outperforming comparison-based sorts like quicksort or mergesort in such scenarios, though it requires additional space proportional to k and is impractical for large or unknown ranges without modifications.[2] Counting sort is often used as a subroutine in more advanced algorithms, such as radix sort for larger integers, and remains a key example in algorithm education for illustrating space-time tradeoffs in sorting.[4]
Fundamentals
Description
Counting sort is a non-comparative integer sorting algorithm that determines the sorted order of elements by first counting the occurrences of each possible value in the input and then using these counts, along with arithmetic operations, to directly compute and place each element in its final position in the output array.[3] This approach was originally described as a subroutine within digital sorting methods by Harold H. Seward in his 1954 master's thesis at MIT.[3]
Unlike comparison-based algorithms that rely on pairwise element comparisons to establish order, counting sort leverages the discrete nature of the input values and a known bounded range, employing array indices directly corresponding to those values to avoid comparisons altogether.[1] It operates by initializing a count array sized to the input range, incrementing entries for each input value's frequency, and then transforming this into a cumulative distribution that specifies output positions.[1]
As a distribution-based sorting method, counting sort partitions elements into temporary storage based on their values—effectively using the count array as a set of buckets—and reconstructs the sorted sequence by placing elements according to the cumulative counts, akin to a single-pass variant of the distribution step in radix sort.[1] This design enables efficient handling of discrete data when the value range is not excessively large, though it involves a space trade-off proportional to that range.[1]
Assumptions
Counting sort requires a specific set of input conditions to function correctly and efficiently. The input is an array of n elements, where each element is a non-negative integer with values ranging from 0 to k, and k represents the maximum value in the array, which must be known beforehand to determine the size of the auxiliary count array.[5] This known range allows for the allocation of a count array of size k+1, enabling the algorithm to tally occurrences without exceeding memory bounds proportional to the input range.[6]
The output of counting sort is a new array of the same size n, containing the input elements rearranged in non-decreasing order.
The assumption of non-negative integers is crucial because negative values would correspond to invalid negative indices in the count array, which are not supported in standard array implementations. Although adaptations exist—such as shifting all values by adding the absolute value of the minimum element to map them to a non-negative range—this approach deviates from the standard algorithm and can inflate the effective range k, thereby reducing efficiency in both time and space when the input span is large.[7]
The Algorithm
Counting sort assumes the input consists of non-negative integers within a known range [0, k].[8]
The following pseudocode describes the stable version of the algorithm, using zero-based indexing for arrays. It operates on an input array A of length n, producing a sorted output array B of the same length, with a count array C of size k+1.[8][9]
procedure countingSort(A[0..n-1], B[0..n-1], k)
    C ← array of size (k + 1), initialized to 0    // Count occurrences of each value
    for i ← 0 to n-1 do
        C[A[i]] ← C[A[i]] + 1                      // Increment count for A[i]
    for j ← 1 to k do
        C[j] ← C[j] + C[j-1]                       // Compute cumulative counts for placement positions
    for i ← n-1 downto 0 do
        B[C[A[i]] - 1] ← A[i]                      // Place A[i] at the correct position in B
        C[A[i]] ← C[A[i]] - 1                      // Decrement the count to handle multiples
This pseudocode initializes the count array C to zeros, which tracks the frequency of each possible value in the range [0, k]. The first loop increments C[A[i]] for each input element, building the frequency distribution. The second loop transforms C into a cumulative sum array, so that C[j] represents the number of elements less than or equal to j, giving the position just past the end of value j's block in the sorted output. The third loop iterates backward through A to ensure stability, preserving the relative order of equal elements: it places each A[i] at position C[A[i]] - 1 in B and then decrements C[A[i]] so that the next element with the same value lands one slot earlier.[8][9]
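For concreteness, the pseudocode translates into the following Python sketch; the function name counting_sort and its list-based interface are illustrative choices rather than part of the standard presentation.

```python
def counting_sort(A, k):
    """Stably sort a list A of integers drawn from the range [0, k]."""
    n = len(A)
    C = [0] * (k + 1)                 # count array, one slot per possible value
    for i in range(n):                # first loop: tally frequencies
        C[A[i]] += 1
    for j in range(1, k + 1):         # second loop: C[j] = number of elements <= j
        C[j] += C[j - 1]
    B = [0] * n                       # output array
    for i in range(n - 1, -1, -1):    # third loop: backward pass for stability
        B[C[A[i]] - 1] = A[i]         # place A[i] at the end of its value's block
        C[A[i]] -= 1                  # next equal key goes one slot earlier
    return B
```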
Step-by-Step Execution
To illustrate the execution of counting sort, consider an input array A of length n = 8 containing integers in the range 0 to k = 5: A = [2, 5, 3, 0, 2, 3, 0, 3]. This example follows the standard stable implementation of the algorithm as described in introductory algorithms texts.[10]
The process begins by initializing a count array C of size k + 1 = 6 (indices 0 to 5) to all zeros. Then, for each element in A, the corresponding entry in C is incremented to record the frequency of each value:
| Value | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| Count | 2 | 0 | 2 | 3 | 0 | 1 |
Next, the count array is transformed into a cumulative count array by setting C[i] = C[i] + C[i-1] for i = 1 to k, where C[0] remains unchanged. After this step, C[i] gives the number of elements less than or equal to i, which marks the end of value i's block in the output array and ensures elements appear in non-decreasing order:
| Value | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| Cumulative | 2 | 2 | 4 | 7 | 7 | 8 |
An output array B of size n is then initialized (typically to empty or a placeholder). To produce a stable sort, the input array is scanned from right to left. Each element A[i] is placed at position C[A[i]] - 1 in B, and C[A[i]] is then decremented. This backward pass preserves the relative order of equal elements. The placements proceed as follows:
- For A[7] = 3, place 3 at position 6 in B, decrement C[3] to 6.
- For A[6] = 0, place 0 at position 1 in B, decrement C[0] to 1.
- For A[5] = 3, place 3 at position 5 in B, decrement C[3] to 5.
- For A[4] = 2, place 2 at position 3 in B, decrement C[2] to 3.
- For A[3] = 0, place 0 at position 0 in B, decrement C[0] to 0.
- For A[2] = 3, place 3 at position 4 in B, decrement C[3] to 4.
- For A[1] = 5, place 5 at position 7 in B, decrement C[5] to 7.
- For A[0] = 2, place 2 at position 2 in B, decrement C[2] to 2.
The final output array is B = [0, 0, 2, 2, 3, 3, 3, 5].[10]
Throughout this execution, positions for elements in the output are determined directly from the precomputed counts and cumulative sums, without any direct comparisons between input values.[10]
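Running the illustrative counting_sort sketch from the previous section on this input reproduces the traced result:

```python
A = [2, 5, 3, 0, 2, 3, 0, 3]
print(counting_sort(A, 5))    # [0, 0, 2, 2, 3, 3, 3, 5]
```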
Analysis
Time Complexity
The time complexity of counting sort is O(n + k), where n is the length of the input array and k is the maximum key value, so that the keys take one of k + 1 possible values. This bound holds regardless of the input distribution, as the algorithm performs a fixed set of operations independent of the specific arrangement of elements.[1]
The analysis breaks down as follows: the initial counting phase iterates over the n input elements exactly once to tally occurrences in the auxiliary array, requiring O(n) time; the cumulative sum phase (or prefix sum computation for stable placement) iterates over the k possible values in the auxiliary array, requiring O(k) time; and the output phase iterates over the n input elements once more to construct the sorted array, again requiring O(n) time.[1] Initialization of the auxiliary array also takes O(k) time, which is subsumed in the overall O(k) term.
More precisely, the algorithm makes a constant number of passes: two over the n input elements and two over the k + 1 counter slots, for a total running time of Θ(n + k). When k = O(n), this simplifies to linear time O(n), making counting sort particularly efficient for inputs with bounded integer keys.[1] As a non-comparison-based sorting algorithm, counting sort avoids the Ω(n log n) lower bound that applies to comparison sorts, enabling its linear performance in suitable scenarios.
Space Complexity
The space complexity of counting sort is O(n + k), where n is the number of elements to sort and k is the maximum key value (more generally, the width of the key range). This arises primarily from the count array, which requires O(k) space to tally frequencies of each possible key value, and the output array, which needs O(n) space to build the sorted result without overwriting the input.[1]
More precisely, the total auxiliary storage is Θ(n + k): n slots for the output array plus k + 1 counters, with only constant additional overhead for loop indices and temporaries.[1]
A key trade-off occurs when k ≫ n, as the space then scales with k, rendering counting sort inefficient or infeasible for datasets with expansive key ranges, such as keys that span values up to n³ or more.[1]
Variants
Stable Variant
A stable sorting algorithm preserves the relative order of records with equal keys as they appear in the sorted output compared to the input.[11]
An implementation that fills the output array by traversing the input from beginning to end with the same cumulative counts places earlier occurrences of equal keys toward the end of their group in the output, reversing their relative order and making the sort unstable.[10] To ensure stability, the algorithm instead traverses the input array in reverse order during the placement phase, assigning positions such that earlier input elements with equal keys receive lower indices in the output.[9]
This reverse traversal works because the cumulative count array initially records the rightmost available position for each key; processing later input elements first places them at the higher indices within the key group, leaving the lower indices for earlier elements reached later in the backward scan.[10]
The pseudocode for the stable variant, adapted to zero-based indexing, highlights the backward loop:
COUNTING-SORT(A, B, k)
 1  let count[0..k] be a new array
 2  for i ← 0 to k
 3      count[i] ← 0
 4  for j ← 0 to n - 1
 5      count[A[j]] ← count[A[j]] + 1
 6  for i ← 1 to k
 7      count[i] ← count[i] + count[i - 1]
 8  for j ← n - 1 downto 0
 9      B[count[A[j]] - 1] ← A[j]
10      count[A[j]] ← count[A[j]] - 1
The loop at lines 8–10 iterates backward over the input, decrementing the count after each placement to fill positions from right to left within equal-key groups.[11]
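Stability only becomes visible when elements carry data beyond their keys. The following Python sketch, with a hypothetical record format and key function, applies the same backward loop to key/payload pairs:

```python
def counting_sort_by_key(records, key, k):
    """Stably sort records whose key(r) is an integer in [0, k]."""
    count = [0] * (k + 1)
    for r in records:                    # tally key frequencies
        count[key(r)] += 1
    for i in range(1, k + 1):            # cumulative counts
        count[i] += count[i - 1]
    out = [None] * len(records)
    for r in reversed(records):          # backward pass: equal keys keep input order
        count[key(r)] -= 1
        out[count[key(r)]] = r
    return out

pairs = [(1, 'a'), (0, 'b'), (1, 'c'), (0, 'd')]
print(counting_sort_by_key(pairs, key=lambda p: p[0], k=1))
# [(0, 'b'), (0, 'd'), (1, 'a'), (1, 'c')]: 'b' still precedes 'd', 'a' precedes 'c'
```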
Generalized Variant
Counting sort can be generalized to sort objects with arbitrary keys by first mapping those keys to a bounded set of non-negative integers, enabling the use of the standard counting mechanism on the mapped values.[12] This mapping establishes a one-to-one correspondence between the original keys and integers within a feasible range [0, k], where the number of distinct keys or buckets is at most k + 1. A common application of this generalized form is as a stable subroutine in radix sort, where counting sort processes individual digits or characters of the keys in multiple passes to achieve overall sorting of larger integers or strings.[13] Each pass treats the digit positions as small integer keys, mapping them directly to counts without needing additional hashing.
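As a sketch of that role, with invented helper names, a least-significant-digit radix sort can apply one stable counting pass per base-10 digit:

```python
def counting_pass(A, digit):
    """One stable counting-sort pass keyed on a base-10 digit (0 = least significant)."""
    key = lambda x: (x // 10 ** digit) % 10
    count = [0] * 10
    for x in A:
        count[key(x)] += 1
    for d in range(1, 10):
        count[d] += count[d - 1]
    out = [0] * len(A)
    for x in reversed(A):        # stability across passes is what makes radix sort correct
        count[key(x)] -= 1
        out[count[key(x)]] = x
    return out

def radix_sort(A):
    digits = len(str(max(A))) if A else 0
    for d in range(digits):      # least significant digit first
        A = counting_pass(A, d)
    return A

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
# [2, 24, 45, 66, 75, 90, 170, 802]
```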
To handle negative integers, which fall outside the non-negative assumption of basic counting sort, an offset is applied by adding the absolute value of the minimum element to all keys, shifting the range to start at zero. For example, given keys {-3, -1, 0, 2}, the minimum is -3, so add 3 to each: {0, 2, 3, 5}; after counting and placement, subtract 3 during output to restore originals.[14] This preserves the linear time complexity as long as the shifted range remains manageable.
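A minimal sketch of this shifting adaptation, using the simple histogram form of the algorithm for brevity:

```python
def counting_sort_with_negatives(A):
    """Sort integers that may be negative by shifting the key range to start at zero."""
    lo = min(A)
    count = [0] * (max(A) - lo + 1)
    for x in A:
        count[x - lo] += 1                # e.g. -3 maps to index 0 when lo = -3
    out = []
    for v, c in enumerate(count):
        out.extend([v + lo] * c)          # shift back when emitting the output
    return out

print(counting_sort_with_negatives([-3, 2, -1, 0, 2]))   # [-3, -1, 0, 2, 2]
```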
For floating-point numbers, counting sort is adapted by bucketing the values into discrete integer ranges, such as by scaling or flooring to map continuous values to a finite set of integers representing intervals (e.g., dividing the range [min, max] into k equal buckets and assigning each float to its bucket index).[15] This discretization transforms the problem into integer counting, though it introduces approximation if exact ordering within buckets is needed, often requiring a secondary sort on those buckets.
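A rough sketch of that discretization, where the bucket count k and helper names are arbitrary illustrative choices:

```python
def bucketed_sort(values, k=10):
    """Discretize floats in [min, max] into k buckets, then sort within each bucket."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0                      # guard against a zero-width range
    buckets = [[] for _ in range(k)]
    for v in values:
        i = min(int((v - lo) / width), k - 1)         # clamp v == hi into the last bucket
        buckets[i].append(v)
    return [v for b in buckets for v in sorted(b)]    # secondary sort restores exact order

print(bucketed_sort([0.5, 0.1, 0.9, 0.3]))            # [0.1, 0.3, 0.5, 0.9]
```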
Applications and Limitations
Applications
Counting sort plays a crucial role in radix sort, particularly for sorting multi-digit numbers, where it serves as the stable subroutine applied to each digit position to achieve overall linear time sorting for fixed-length keys.[16]
The algorithm is well-suited for sorting data with small fixed ranges, such as student grades from 0 to 100, enabling efficient organization of educational records without comparison overhead.[17] Similarly, in image processing, counting sort is suitable for sorting pixel intensities from 0 to 255, as these values lie within a small discrete range ideal for linear-time sorting methods.[18]
As a preprocessing step in data structures, counting sort computes frequency distributions, which is essential for histogram generation and subsequent analysis, such as in statistical profiling or data summarization where the count array directly represents occurrence tallies for discrete values.[19]
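As a minimal illustration with invented grade data, the counting phase by itself already produces the histogram:

```python
grades = [90, 75, 90, 60, 75, 90]     # hypothetical keys in the range 0..100
hist = [0] * 101                      # one bin per possible grade
for g in grades:
    hist[g] += 1                      # the counting phase is the histogram
print(hist[75], hist[90])             # 2 3
```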
Limitations
Counting sort is fundamentally limited to sorting integers within a known, bounded range [0, k], rendering it inapplicable to continuous data types such as real numbers or unbounded domains where the range cannot be predefined.[20] This restriction arises because the algorithm relies on array indexing to count occurrences of each possible value, which is infeasible for non-discrete or infinite sets.[21]
When the range k significantly exceeds the input size n (i.e., k ≫ n), counting sort suffers from excessive space usage due to the O(k) count array, leading to substantial memory waste for sparse or widely distributed keys.[20] This space explosion makes the algorithm impractical in such scenarios, as the auxiliary storage can dominate the overall cost of sorting.[21]
Additionally, counting sort is not an in-place algorithm in its standard form, necessitating O(n + k) extra memory for the count and output arrays, which contrasts with space-efficient alternatives for general-purpose sorting.[20]
For general cases without the integer-range assumption, comparison-based algorithms like quicksort or mergesort are preferable, as they handle arbitrary data types and orderings without space overhead proportional to the key range.
Historical Development
Origins
Counting sort was invented by Harold H. Seward in 1954 while he was a graduate student at the Massachusetts Institute of Technology (MIT).[22]
Seward first described the algorithm in his master's thesis titled Information Sorting in the Application of Electronic Digital Computers to Business Operations, published as MIT Digital Computer Laboratory Report R-232 in May 1954.[3] In this work, he presented counting sort—referred to as a "floating digital sort"—as an efficient method for internal sorting of data represented by decimal digits, particularly suited for the hardware constraints of early electronic computers.[3]
The development occurred during the early computing era, when data processing in business applications relied heavily on punched card systems, such as those from IBM, which processed cards at rates like 450 per minute per digit using electric brushes to detect punched holes.[3] Seward's algorithm addressed the need for streamlined sorting on limited hardware, including magnetic tape units and the Whirlwind computer project at MIT, by minimizing passes over the data and leveraging direct frequency-based distribution rather than comparisons.[3]
Preceding mechanical techniques, such as manual "Keysort" cards that used binary sorting with needles to separate punched cards by holes, also informed the digit-by-digit distribution central to the algorithm.[3]
Significance
Counting sort represents a pioneering example of non-comparative sorting algorithms, achieving linear time complexity O(n + k) under the assumption of a limited range of input values, which challenges the Ω(n log n) lower bound for general comparison-based sorts and has influenced the development of other linear-time sorting techniques.[23]
As a foundational building block, counting sort serves as a key subroutine in more advanced algorithms like radix sort and bucket sort, enabling efficient handling of larger or distributed datasets by breaking down sorting into digit-by-digit or bucket-level operations.[23][24]
In educational contexts, counting sort plays a crucial role in illustrating algorithm design trade-offs, such as the balance between time efficiency and space requirements, as well as the impact of input assumptions on performance, making it a staple in curricula for demonstrating non-comparison-based methods and stability in sorting.[23][25]
Its modern relevance persists in big data environments, where it supports distributed sorting for data with known distributions, such as through radix sort variants in MapReduce frameworks for partitioning and processing large-scale integer keys efficiently.[26][25]