Counting Sort / Radix Sort
Counting sort achieves O(n + k) time by counting occurrences of each value (k = range of values), avoiding comparisons entirely. Radix sort applies counting sort digit by digit for O(d(n+k)) where d is the number of digits.
Counting sort = sorting M&Ms by color. Don't compare them -- just count how many of each color, then line them up. Blazing fast if few colors (small k), useless if every candy is unique.
Counting sort: create count array of size k, count occurrences, compute cumulative sums, build output array right-to-left (for stability). Requires integer (or discretizable) keys with bounded range. Radix sort: sort by least significant digit first (LSD) using stable counting sort, repeat for each digit. O(d*(n+k)) — for 32-bit integers: d=4 passes in base 256 (2^8), k=256 — often faster than O(n log n) in practice for large n.
LSD radix sort is stable and correct because each pass with a stable sort on digit i preserves the relative order established by digits i-1, ..., 0. MSD radix sort (most significant first) can sort in-place and terminate early for strings, but is more complex. Counting sort is non-comparison-based — it beats the O(n log n) lower bound by exploiting integer keys rather than comparing elements, but is only practical when k = O(n). For comparison-based sorts, the O(n log n) lower bound (from information theory — there are n! permutations, each comparison yields 1 bit, so you need at least log₂(n!) ≈ n log n bits) cannot be beaten.
I use counting sort when keys are integers in a small bounded range and n is large — for example, sorting ages, scores out of 100, or characters in a string (O(n + 26)). I use radix sort when keys are integers or fixed-length strings and comparison-based sorts are too slow. The key trade-off: both are non-comparison-based and break the O(n log n) barrier but require additional assumptions about the data (bounded integers). In interviews, mentioning these shows you understand that O(n log n) is not always the optimal answer.
Counting sort requires ALL values to be non-negative integers (or mapped to such). For negative numbers, shift all values by min(array). Also: counting sort is only faster than O(n log n) when k = O(n). If k = O(n²), counting sort is O(n²) and worse than comparison sorts.