Overview
Heapify transforms an unsorted array into a binary heap in linear time. The standard procedure (often called Floyd’s method) performs bottom-up sift-down on each internal node, ensuring that the heap-order property holds. This is more efficient than inserting elements one by one (which costs O(n log n)), and it underpins fast heap construction for Heapsort and priority-queue initialization.
Core Idea
A binary heap is a complete binary tree stored in an array A[0..n-1] with parent/child relations:
-
left(i) = 2i + 1,right(i) = 2i + 2(0-indexed). -
Max-heap (shown here):
A[i] ≥ A[left(i)]andA[i] ≥ A[right(i)]when children exist. (For min-heap, reverse comparisons.)
If every subtree rooted at i is a heap, then doing a sift-down at i fixes violations at that node while preserving heap order below. By processing nodes from the last internal node up to the root, each sift-down traverses only as far as necessary (often a few levels), yielding O(n) total work.
Algorithm Steps / Pseudocode
Sift-down (max-heap)
function SIFT_DOWN(A, n, i):
// assumes subtrees at i are heaps; restores heap at i
while true:
L = 2*i + 1
R = 2*i + 2
largest = i
if L < n and A[L] > A[largest]:
largest = L
if R < n and A[R] > A[largest]:
largest = R
if largest == i:
break
swap A[i], A[largest]
i = largestBuild-heap (Floyd’s heapify)
function HEAPIFY(A, n):
// last internal node is floor(n/2) - 1 in 0-indexed arrays
for i in reverse(floor(n/2) - 1 .. 0):
SIFT_DOWN(A, n, i)Tip
For min-heap, replace the ”>`” comparisons with ”<” and the variable names accordingly.
Example or Trace
Array: A = [4, 10, 3, 5, 1, 8, 12] (n=7).
Internal nodes are indices 0..2 (since ⌊n/2⌋−1 = 2).
-
i=2(A[2]=3, children at 5 (8) and 6 (12)): swap with12→A = [4,10,12,5,1,8,3]. Sift-down ends (new index 6 is a leaf). -
i=1(A[1]=10, children 3 (5), 4 (1)): already ≥ children → no change. -
i=0(A[0]=4, children 1 (10), 2 (12)): swap with12→A=[12,10,4,5,1,8,3]. Continue at index 2 (children 5=8, 6=3): swap with8→A=[12,10,8,5,1,4,3]. End.
Final max-heap: [12,10,8,5,1,4,3]. The tree’s parent is always ≥ children.
Complexity Analysis
-
Time (Floyd’s heapify):
O(n). Why: A node at heighth(distance to a leaf) can trigger at mosthswaps. In a complete binary tree:-
≤
⌈n/2⌉nodes haveh=0, -
≤
⌈n/4⌉haveh=1, -
≤
⌈n/8⌉haveh=2, … Total work ≤∑_{h≥0} (n/2^{h+1})·h = n·∑_{h≥0} h/2^{h+1} = n·1 = O(n)(the series sums to 1).
-
-
Space:
O(1)extra; operations are in-place on the array. -
Compare with
ninserts: Repeatedpushinto a heap costsO(n log n)in total because each insert may bubble upO(log n)levels.
Optimizations or Variants
-
Top-down vs bottom-up: Bottom-up is strictly better for bulk build. Use top-down only if streaming elements one at a time.
-
Branch-reduced sift-down: Pick child index with a single compare (
choose child = argmax(A[L], A[R])ifR<n), then one compare against parent; reduces branches on hot paths. -
Pull-up technique: Save
x = A[i]; while child larger thanx, move child up; placexonce at final position. Cuts swaps roughly in half. -
k-ary heaps: Replace child formula with
child_t(i) = k*i + t (1≤t≤k)and adapt height arguments; heapify remainsO(n). -
Bottom-up construction for min-heap (priority queue with smallest at root): identical structure with inverted comparisons.
Applications
-
Heapsort: Call
HEAPIFY, then repeatedly swapA[0]withA[n-1], decrementn, andSIFT_DOWN(A, n, 0). TotalO(n log n)afterO(n)build. -
Priority queues: Fast initialization from an existing dataset (e.g., schedule by priority).
-
Selection/Top-k: Build a min-heap of size
k(or heapify the whole array) to filter or extract extremes efficiently.
Common Pitfalls or Edge Cases
Warning
Indexing confusion (0 vs 1-based). This note assumes 0-indexed arrays with
left=2i+1,right=2i+2, and the last internal node at⌊n/2⌋−1. In 1-indexed arrays:left=2i,right=2i+1, last internal node⌊n/2⌋.
Warning
Wrong start index. Starting heapify at
i = ⌊n/2⌋wastes a step (that node is a leaf). Start at⌊n/2⌋−1.
Warning
Assuming children exist without bounds checks. Always test
L<nandR<nbefore accessing; off-by-one bugs here corrupt the heap.
Warning
Mixing min-/max-heap comparisons. Be consistent across
SIFT_DOWNand later uses (e.g., heapsort expects a max-heap if it starts by swapping root with end).
Implementation Notes or Trade-offs
-
Stability: Binary heaps are not stable; relative order among equal keys is not preserved. If stability is required, carry a
(key, tiebreak)pair. -
Comparable keys: Define a total order (
</>) or a comparator. Inconsistent comparators violate heap invariants. -
Cache behavior: Arrays give good locality; the pull-up variant reduces swaps (writes) which can be beneficial for large data.
-
Partial heapify: For top-k with small
k, heapify ak-element heap and scan the rest with conditional inserts; overallO(n log k). -
Verification: After heapify, quickly check
A[i] ≥ A[left(i)]andA[i] ≥ A[right(i)](or the min-heap version) for all validi.
Summary
Bottom-up heapify builds a heap from an array in linear time by running sift-down from the last internal node to the root. The cost is linear because most nodes sit near the leaves and require only shallow adjustments. This method is the right default for initializing heaps, enabling O(n) build followed by O(log n) per extract in heapsort and priority-queue workloads.