Understanding Counting Sort in Detail
Counting Sort is a unique algorithm because, unlike Bubble Sort or Quick Sort, it does not compare elements to each other. Instead, it uses the **frequency** of values to place them in the correct position. It is an integer-based sorting algorithm that works by counting the number of objects having distinct key values.
Imagine you are a teacher with a stack of exam papers scored between 0 and 5. Instead of comparing every paper to every other paper to sort them, you simply take five empty baskets labeled 0, 1, 2, 3, 4, and 5. You go through the stack once, dropping each paper into its corresponding basket. Finally, you stack the papers from basket 0, then basket 1, and so on. The papers are now sorted perfectly without ever directly comparing one student's score to another.
The Logic Behind the Count
The core mechanism involves mapping values to index positions.
- Range Discovery: First, we determine the range of the input data (the maximum value,
k). - Frequency Count: We create a "Count Array" of size
k+1we see, we increment the value at indexxxin our Count Array. - Cumulative Count: We modify the Count Array so that each element at each index stores the sum of previous counts. This step effectively calculates the starting position of each number in the output array.
- Placement: We iterate through the input array one last time, looking up the correct position in the Count Array and placing the element into the final output array.
Step-by-Step Algorithm Trace
Let's trace the algorithm on a small array.
Input Array: [4, 2, 2, 8, 3, 3, 1]
Range: Min = 1, Max = 8
Step 1: Frequency Count
- Index 1: 1 (one '1')
- Index 2: 2 (two '2's)
- Index 3: 2 (two '3's)
- Index 4: 1 (one '4')
- Index 8: 1 (one '8')
- All other indices (0, 5, 6, 7) are 0.
Count Array: [0, 1, 2, 2, 1, 0, 0, 0, 1]
Step 2: Cumulative Count
- Index 0: 0
- Index 1: 0 + 1 = 1
- Index 2: 1 + 2 = 3
- Index 3: 3 + 2 = 5
- Index 4: 5 + 1 = 6
- ...
- Index 8: ... = 7
Cumulative Array: [0, 1, 3, 5, 6, 6, 6, 6, 7]
Step 3: Build Output
- Take the first number from Input: 4. Look at Cumulative Array index 4 -> value is 6. Place 4 at position 6 (technically index 5). Decrement count.
- Take next number: 2. Look at index 2 -> value is 3. Place 2 at position 3 (index 2). Decrement count.
- Repeat until the array is full.
Final Sorted Array: [1, 2, 2, 3, 3, 4, 8]
Code Implementation
Below is the implementation of Counting Sort in multiple programming languages.
#include <stdio.h>#include <stdlib.h>#include <string.h>void countingSort(int arr[], int n) {if (n <= 0) return;// Find max valueint max = arr[0];for (int i = 1; i < n; i++)if (arr[i] > max) max = arr[i];// Create and initialize count arrayint* count = (int*)calloc(max + 1, sizeof(int));int* output = (int*)malloc(n * sizeof(int));// Count occurrencesfor (int i = 0; i < n; i++)count[arr[i]]++;// Compute cumulative countfor (int i = 1; i <= max; i++)count[i] += count[i - 1];// Build output array (stable sort)for (int i = n - 1; i >= 0; i--) {output[count[arr[i]] - 1] = arr[i];count[arr[i]]--;}// Copy to original arrayfor (int i = 0; i < n; i++)arr[i] = output[i];free(count);free(output);}int main() {int arr[] = {4, 2, 2, 8, 3, 3, 1};int n = sizeof(arr) / sizeof(arr[0]);countingSort(arr, n);printf("Sorted array: ");for (int i = 0; i < n; i++)printf("%d ", arr[i]);return 0;}
Time & Space Complexity Analysis
Counting Sort is extremely fast but "expensive" in terms of memory if the numbers are large.
Time Complexity
The algorithm iterates through the input array (n) and the count array (k). If k is not significantly larger than n, this is effectively linear time, making it faster than Quick Sort.
Space Complexity
The algorithm needs an auxiliary array of size k. If you are sorting numbers between 1 and 1,000,000, you need an array of size 1,000,000, even if you only have 5 items to sort. This makes it inefficient for sparse data.
When to Use Counting Sort?
Counting Sort is a niche algorithm used in very specific high-performance scenarios.
Small Range of Integers
It is perfect for sorting data where the range of potential values is known and small, such as sorting a list of people by age (0-120) or sorting exam scores (0-100).
Linear Time Requirement
When you absolutely need O(n) performance and cannot afford the O(n log n) overhead of comparison sorts, Counting Sort is the solution—provided you have the memory to handle the range.
Subroutine for Radix Sort
Counting Sort is rarely used alone on large numbers. Instead, it is often used as a subroutine for Radix Sort, which can sort large numbers digit-by-digit.