Heap Internals (Min/Max Heap)
A heap is a complete binary tree stored as an array where every parent is ≤ (min-heap) or ≥ (max-heap) its children. Root is always the min (or max).
Heap = a tournament bracket stored in an array. The champion (min or max) is always at the top. New challengers bubble up; dethroned champs sink down. Parent is always the winner of its two children.
Array representation: for node at index i (0-based), left child = 2i+1, right child = 2i+2, parent = (i-1)//2. Operations: insert — append and bubble up — O(log n). Extract-min — swap root with last, remove last, bubble down — O(log n). Peek — O(1). Build heap from array (heapify) — O(n) via sift-down from last non-leaf, NOT O(n log n). Python heapq is a min-heap; for max-heap, negate values.
Heapify is O(n) because most nodes are near the bottom and do little work: the sift-down cost at height h is O(h), and there are O(n/2^h) nodes at height h. Summing: Σ (n/2^h)*h for h=0 to log(n) = O(n). This is the proof that BUILD-HEAP is O(n). D-ary heaps reduce tree height to log_d(n) and improve cache performance — used in Dijkstra implementations where decrease-key is frequent. Binomial and Fibonacci heaps support O(1) amortized decrease-key (needed for efficient Dijkstra). In practice, binary heap with lazy deletion (push new priority, ignore stale pops) beats Fibonacci heap due to constants.
A heap is a complete binary tree satisfying the heap property, stored compactly as an array using index arithmetic. The key insight is the array layout: it avoids pointer overhead and is cache-friendly. I use heaps when I need repeated access to the min or max of a dynamic collection. BUILD-HEAP is O(n), not O(n log n) — this is a common interview question. Python's heapq is a min-heap; I negate values for max-heap, or use a wrapper class with reversed comparison.
Python's heapq.heappush/heappop are correct but do NOT support decrease-key directly. For Dijkstra, use the lazy deletion pattern: push duplicate (new_dist, node) entries and skip processing stale entries on pop by checking if the popped distance matches the current best. Also: heapq operates on a LIST in Python — the list must be initialized as a heap via heapq.heapify() before using push/pop.