Definition
“Best”, “worst”, and “average” cases classify an algorithm’s running time or space usage over families of inputs of the same size (n).
- Worst case: An upper bound over all inputs of size (n). Guarantees performance even under adversarial conditions.
- Best case: A lower bound achieved by the most favorable inputs of size (n). Rarely used for guarantees; helpful for understanding adaptability.
- Average case: The expected cost with respect to a probability distribution over inputs of size (n). Requires explicitly stating the distributional assumptions.
Mathematically, for a cost function (T(x)) on inputs (x) with (|x|=n):
- Worst case: (\max_{\lvert x\rvert=n} T(x))
- Best case: (\min_{\lvert x\rvert=n} T(x))
- Average case (expectation): (\mathbb{E}[T(X_n)]), where (X_n) is a random input of size (n) drawn from the stated distribution.
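For small (n), all three quantities can be computed by brute force. A minimal sketch, assuming the cost (T(x)) is the number of key comparisons insertion sort makes and the average is taken over the uniform distribution on permutations (both choices are illustrative, not required by the definitions):

```python
import itertools
import statistics

def insertion_sort_comparisons(arr):
    """Cost model T(x): number of key comparisons insertion sort performs on x."""
    a = list(arr)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1                 # one comparison of key against a[j]
            if a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            else:
                break
        a[j + 1] = key
    return comparisons

n = 6
costs = [insertion_sort_comparisons(p) for p in itertools.permutations(range(n))]

print("best  (min over inputs):", min(costs))               # min_{|x|=n} T(x)
print("worst (max over inputs):", max(costs))               # max_{|x|=n} T(x)
print("average (uniform dist.):", statistics.mean(costs))   # E[T(X_n)], X_n uniform
```

For (n = 6) this reports 5 comparisons in the best case (already sorted), 15 in the worst case (reverse sorted), and a mean strictly between the two.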
What these bounds mean
- Worst-case bounds are guarantees.
- Average-case bounds are expectations under stated assumptions.
- Best-case bounds are possibilities, not promises.
Why it matters
Engineers need to know how performance degrades when inputs are unfavorable, and what to expect on typical data.
- Systems with tight SLOs prioritize worst-case guarantees (e.g., real-time schedulers, safety-critical code).
- Data-processing apps often rely on average-case behavior when inputs are well-modeled (e.g., randomized quicksort on shuffled data).
- Best-case behavior highlights adaptability (e.g., insertion sort is fast on nearly-sorted data).
Practical takeaway
Choose worst case when you must not exceed a latency/space budget; choose average case when input distributions are stable and can be defended.
Model & Assumptions
A statement like “average (O(f(n)))” is meaningless without:
- A distribution over inputs (e.g., uniform over all permutations, random hash function, i.i.d. keys).
- A cost model (e.g., RAM with unit-cost arithmetic, how comparisons and cache misses are counted).
- Clear preconditions (e.g., “array is randomly shuffled”, or “hash function is universal/independent”).
Be explicit about assumptions
- If you randomize the algorithm (e.g., randomized pivot), you can analyze expected runtime over the algorithm’s coins for any fixed input—this sidesteps unknown input distributions.
- If you rely on input randomness, document the source of randomness (real-world shuffles? adversary likely?).
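A sketch of the two viewpoints, using unordered linear search as a deliberately simple stand-in (the function names and the example itself are assumptions for illustration): the deterministic scan's average cost only makes sense relative to a stated input model, while the randomized scan's expected cost is over its own coin flips for any fixed input.

```python
import random

def scan_probes(arr, target):
    """Deterministic left-to-right scan; cost = probes until the target is found."""
    for probes, value in enumerate(arr, start=1):
        if value == target:
            return probes
    return len(arr)

def random_scan_probes(arr, target, rng):
    """Randomized scan: the algorithm probes positions in its own random order."""
    order = list(range(len(arr)))
    rng.shuffle(order)                            # randomness belongs to the algorithm
    for probes, i in enumerate(order, start=1):
        if arr[i] == target:
            return probes
    return len(arr)

rng = random.Random(0)
n, trials = 1_000, 2_000
arr = list(range(n))

# Input-randomness view: average cost under a *stated* input model (uniform target).
avg_inputs = sum(scan_probes(arr, rng.randrange(n)) for _ in range(trials)) / trials

# Algorithm-randomness view: expected cost over the algorithm's coins,
# for one fixed, adversarial input (target sitting at the very end).
avg_coins = sum(random_scan_probes(arr, n - 1, rng) for _ in range(trials)) / trials

print(f"deterministic scan, random targets:  ~{avg_inputs:.0f} probes")
print(f"randomized scan, fixed worst target: ~{avg_coins:.0f} probes")
```

Both averages come out near ((n+1)/2) probes, but only the second one holds regardless of how the input was produced.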
Examples
Quicksort
- Worst: (O(n^2)) when pivots are consistently extreme (e.g., already-sorted data with deterministic first-element pivot).
- Average (random pivot or random permutation): (O(n \log n)).
- Best: (O(n \log n)) when partitions stay balanced; not asymptotically better than the average case (both pivot strategies are sketched after the recursion shapes below).
Recursion shapes
- Worst case: highly unbalanced recursion (depth (n)).
- Average case: roughly balanced recursion (depth (\Theta(\log n))).
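A minimal sketch under simple assumptions (out-of-place three-way partitioning, cost counted as element-vs-pivot comparisons) that contrasts a deterministic first-element pivot with a random pivot on already-sorted input and reports the recursion depth:

```python
import random
import sys

sys.setrecursionlimit(20_000)

def quicksort_stats(values, choose_pivot):
    """Sort a copy of `values`; report (element-vs-pivot comparisons, max recursion depth)."""
    stats = {"comparisons": 0, "depth": 0}

    def sort(xs, depth):
        stats["depth"] = max(stats["depth"], depth)
        if len(xs) <= 1:
            return xs
        pivot = choose_pivot(xs)
        stats["comparisons"] += len(xs)          # every element is compared with the pivot
        less = [x for x in xs if x < pivot]
        equal = [x for x in xs if x == pivot]
        greater = [x for x in xs if x > pivot]
        return sort(less, depth + 1) + equal + sort(greater, depth + 1)

    assert sort(list(values), 0) == sorted(values)
    return stats["comparisons"], stats["depth"]

rng = random.Random(1)
already_sorted = list(range(2_000))

first_element = lambda xs: xs[0]                 # deterministic, exploitable pivot choice
random_pivot = lambda xs: rng.choice(xs)         # pivot chosen by the algorithm's own coins

print(quicksort_stats(already_sorted, first_element))  # ~n^2/2 comparisons, depth close to n
print(quicksort_stats(already_sorted, random_pivot))   # ~2n ln n comparisons, shallow depth
```

On sorted input the deterministic pivot does roughly (n^2/2) comparisons with depth close to (n) (the unbalanced shape), while the random pivot does on the order of (2n\ln n) comparisons with logarithmic depth in expectation (the balanced shape).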
Hash table lookups (chaining)
- Worst: (O(n)) if all keys collide in one bucket (adversarial or poor hash).
- Average: (O(1)) expected under simple uniform hashing with bounded load factor.
- Best: (O(1)) when the key lands in a short bucket, or when a miss is detected immediately because the bucket is empty.
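A toy separate-chaining table, sketched under the assumption that the cost of interest is the number of key comparisons per lookup (the class and its parameters are illustrative):

```python
class ChainedTable:
    """Toy hash table with separate chaining; counts key comparisons per lookup."""

    def __init__(self, buckets, hash_fn=hash):
        self.buckets = [[] for _ in range(buckets)]
        self.hash_fn = hash_fn

    def _bucket(self, key):
        return self.buckets[self.hash_fn(key) % len(self.buckets)]

    def insert(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:                      # overwrite an existing key
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def lookup_probes(self, key):
        """Number of key comparisons made while searching (0 for an empty bucket)."""
        probes = 0
        for k, _ in self._bucket(key):
            probes += 1
            if k == key:
                break
        return probes

decent = ChainedTable(buckets=128)                           # load factor 0.5 after 64 inserts
adversarial = ChainedTable(buckets=128, hash_fn=lambda k: 0)  # every key collides in one bucket

for key in range(64):
    decent.insert(key, None)
    adversarial.insert(key, None)

print("mean probes, decent hash:     ", sum(decent.lookup_probes(k) for k in range(64)) / 64)
print("mean probes, all-collide hash:", sum(adversarial.lookup_probes(k) for k in range(64)) / 64)
```

With keys spread across buckets, each lookup touches about one entry; with the all-collide hash, lookups degrade to a linear scan of a single long chain.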
Binary search
- Worst: (O(\log n)) on any input, because the search interval halves with every probe.
- Average: (O(\log n)); distribution of target locations does not change the asymptotic bound.
- Best: (O(1)) if the first probe hits the target (degenerate but possible).
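A standard iterative binary search, instrumented with a probe counter as a sketch of why every case is bounded by the logarithmic path length:

```python
def binary_search_probes(sorted_values, target):
    """Return (index or None, number of probes) for a standard binary search."""
    lo, hi = 0, len(sorted_values) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_values[mid] == target:
            return mid, probes
        if sorted_values[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None, probes

data = list(range(1_000_000))
print(binary_search_probes(data, 499_999))  # middle element: found on the 1st probe (best case)
print(binary_search_probes(data, 0))        # ~20 probes: bounded by ceil(log2 n)
print(binary_search_probes(data, -5))       # unsuccessful search: still ~20 probes (worst case)
```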
Insertion sort
- Worst: (O(n^2)) comparisons/shifts on reverse-sorted arrays.
- Average: (O(n^2)) over random permutations, but with a smaller constant than selection sort.
- Best: (O(n)) on already sorted arrays (adaptive behavior).
Mini trace: insertion sort on nearly-sorted data
Each insertion shifts only a few elements, so the total number of shifts scales with the number of inversions, not with (n^2).
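A sketch that makes the trace measurable by counting shifts, which for insertion sort equals the input's inversion count (the choice of 20 adjacent swaps is an arbitrary stand-in for “nearly sorted”):

```python
import random

def insertion_sort_shifts(values):
    """Sort a copy and return the number of element shifts (equals the inversion count)."""
    a = list(values)
    shifts = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]              # one shift per inversion involving `key`
            shifts += 1
            j -= 1
        a[j + 1] = key
    return shifts

n = 2_000
rng = random.Random(2)

sorted_input = list(range(n))
nearly_sorted = list(range(n))
for _ in range(20):                      # a handful of random adjacent swaps
    i = rng.randrange(n - 1)
    nearly_sorted[i], nearly_sorted[i + 1] = nearly_sorted[i + 1], nearly_sorted[i]
reverse_sorted = list(range(n, 0, -1))

print("sorted:        ", insertion_sort_shifts(sorted_input))    # 0 shifts
print("nearly sorted: ", insertion_sort_shifts(nearly_sorted))   # ~20 shifts (inversions)
print("reverse sorted:", insertion_sort_shifts(reverse_sorted))  # n(n-1)/2 shifts
```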
Graph algorithms (BFS/DFS)
- Worst/Average/Best: (O(n+m)) for the traversal itself with adjacency lists; the bound is insensitive to input ordering, but the graph's density ((m) relative to (n)) strongly influences actual running time.
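A BFS sketch over an adjacency-list graph; the (O(n+m)) bound falls out because each vertex is enqueued at most once and each adjacency list is scanned exactly once:

```python
from collections import deque

def bfs_order(adjacency, source):
    """Breadth-first traversal: each vertex enqueued once, each edge examined once."""
    visited = {source}
    order = []
    queue = deque([source])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adjacency[u]:           # total edge scans across the whole run: O(m)
            if v not in visited:
                visited.add(v)
                queue.append(v)
    return order

# Tiny undirected graph: edges 0-1, 0-2, 1-3, 2-3
adjacency = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
print(bfs_order(adjacency, 0))           # [0, 1, 2, 3]
```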
Model & Assumptions (worked patterns)
- Uniform over inputs: average over all permutations/graphs/etc. Often unrealistic without justification.
- Stochastic process for inputs: e.g., keys are i.i.d. from a distribution (D) (state (D) explicitly).
- Randomized algorithms: expected runtime is over the algorithm’s own randomness for any input (Yao’s principle relates this to distributional inputs).
Concrete examples:
- Randomized quicksort: the expectation is over random pivot choices; expected (O(n\log n)) for every fixed input.
- Cuckoo hashing: worst-case (O(1)) lookups (at most two candidate slots per key) and expected (O(1)) insertions, under specific independence assumptions on the hash functions.
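A toy single-array cuckoo table as a sketch of the lookup guarantee; the hash functions here are seeded Python `hash` calls standing in for the independent hash families the analysis assumes, and a real implementation would rehash instead of raising when an eviction chain runs too long:

```python
import random

class CuckooTable:
    """Toy cuckoo hash set: every key may live in only one of two candidate slots."""

    def __init__(self, size=2_048, max_evictions=64, seed=3):
        r = random.Random(seed)
        self.size = size
        self.max_evictions = max_evictions
        self.seed1, self.seed2 = r.random(), r.random()
        self.slots = [None] * size

    def _h1(self, key):
        return hash((self.seed1, key)) % self.size

    def _h2(self, key):
        return hash((self.seed2, key)) % self.size

    def lookup(self, key):
        """Worst-case O(1): at most two slots are ever probed."""
        return self.slots[self._h1(key)] == key or self.slots[self._h2(key)] == key

    def insert(self, key):
        if self.lookup(key):
            return
        slot = self._h1(key)
        for _ in range(self.max_evictions):
            if self.slots[slot] is None:
                self.slots[slot] = key
                return
            key, self.slots[slot] = self.slots[slot], key   # evict the current occupant
            # the evicted key moves to its *other* candidate slot
            slot = self._h2(key) if slot == self._h1(key) else self._h1(key)
        raise RuntimeError("eviction chain too long; a real implementation would rehash")

table = CuckooTable()
for k in range(400):                     # load factor ~0.2, comfortably insertable
    table.insert(k)
print(all(table.lookup(k) for k in range(400)), table.lookup(10_000))   # True False
```

Lookups inherit the two-probe bound unconditionally; it is insertion whose expected-constant cost depends on the independence assumptions and on keeping the load factor below roughly 1/2.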
Properties & Relationships
- Worst-case dominates: For asymptotic guarantees, worst-case bounds are sufficient (but sometimes pessimistic).
- Average-case depends on modeling: Changing the input or hash distribution can change the bound.
- Best-case rarely drives design: It’s a sanity check or highlights adaptability (e.g., partially sorted inputs).
Connections:
- Asymptotic Notation formalizes (O,\Omega,\Theta).
- Amortized Analysis differs: it averages cost over a sequence of operations on one data structure (a guarantee for the whole sequence), not over a distribution of input instances.
- Time Complexity — Analysis and Time Complexity — Calculation provide techniques (summations, recurrences) to derive these bounds.
Heuristics & Design Guidance
- If the worst case matters, prefer algorithms with strong worst-case guarantees (e.g., heapsort (O(n\log n)) vs quicksort’s (O(n^2))).
- If inputs are well-mixed and latency spikes are tolerable, exploit average-case winners (randomized quicksort, hash tables).
- Use randomization to defend against adversarial inputs (e.g., randomized pivoting, randomized hashing).
- Detect structure at runtime: switch to insertion sort on tiny subarrays or nearly sorted data; use introsort (quicksort → heapsort fallback) to combine average-case speed with worst-case safety.
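A toy sketch of the introsort idea under simple assumptions (out-of-place three-way partitioning, a random pivot, a depth budget of (2\log_2 n)); real library versions sort in place and are far more carefully tuned:

```python
import heapq
import math
import random

def introsort_sketch(values, small_cutoff=16):
    """Quicksort while a depth budget lasts, heapsort when it runs out,
    insertion sort for tiny subproblems."""

    def insertion_sort(xs):
        for i in range(1, len(xs)):
            key, j = xs[i], i - 1
            while j >= 0 and xs[j] > key:
                xs[j + 1] = xs[j]
                j -= 1
            xs[j + 1] = key
        return xs

    def heapsort(xs):
        heapq.heapify(xs)                               # worst-case O(n log n) safety net
        return [heapq.heappop(xs) for _ in range(len(xs))]

    def sort(xs, depth_budget):
        if len(xs) <= small_cutoff:
            return insertion_sort(xs)                   # fast on tiny / nearly-sorted pieces
        if depth_budget == 0:
            return heapsort(xs)                         # recursion got too deep: bail out
        pivot = random.choice(xs)                       # randomized pivot for average-case speed
        less = [x for x in xs if x < pivot]
        equal = [x for x in xs if x == pivot]
        greater = [x for x in xs if x > pivot]
        return sort(less, depth_budget - 1) + equal + sort(greater, depth_budget - 1)

    xs = list(values)
    return sort(xs, 2 * max(1, int(math.log2(max(len(xs), 2)))))

data = [random.randrange(10_000) for _ in range(5_000)]
assert introsort_sketch(data) == sorted(data)
```

The random pivot buys average-case speed, the depth budget caps how badly an unlucky or adversarial run can behave, and the insertion-sort cutoff exploits the best-case adaptivity noted earlier.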
Common Pitfalls
Conflating "average" with "expected"
“Average case” is an expectation with respect to a specified distribution. If the real inputs don’t match that distribution, the claim may not hold.
Ignoring adversaries
Deterministic choices (first/last-element pivot, a fixed hash function) are exploitable; an attacker can force quicksort into its worst-case (O(n^2)) or drive hash-table lookups to (O(n)).
Misusing best-case
Best-case bounds do not imply typical fast performance; they only show what’s possible, not what’s likely.
Prefer transparent assumptions
State the input model (uniform permutations, i.i.d. draws, or algorithmic randomization). If in doubt, prove a worst-case bound and measure empirical averages.
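A sketch of the “measure empirical averages” half of that advice; the input generator used here (uniformly shuffled lists) is an illustrative assumption that should mirror production data, and reporting a tail percentile alongside the mean keeps latency spikes visible:

```python
import random
import statistics
import time

def measure(fn, generate_input, trials=200, seed=0):
    """Estimate typical cost empirically: run `fn` on inputs drawn from a *stated*
    generator and report the mean and a tail percentile of wall-clock time."""
    rng = random.Random(seed)
    samples = []
    for _ in range(trials):
        data = generate_input(rng)
        start = time.perf_counter()
        fn(data)
        samples.append(time.perf_counter() - start)
    samples.sort()
    return statistics.mean(samples), samples[int(0.99 * (len(samples) - 1))]

# Example: sorted() on shuffled 10k-element lists (the stated input model).
mean_s, p99_s = measure(sorted, lambda rng: rng.sample(range(10_000), 10_000))
print(f"mean {mean_s * 1e3:.2f} ms, p99 {p99_s * 1e3:.2f} ms")
```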
Practical Context
- Libraries: Many standard libraries use introsort (quicksort with median-of-three + heapsort fallback) to mitigate worst-case while keeping average speed.
- Systems: Hash tables rely on load-factor control and good hash functions (sometimes randomized) to ensure expected (O(1)).
- Data skew: Real-world keys are rarely uniform; evaluate with profiles and benchmarks that mirror production distributions.
Summary
- Worst case: guarantees performance under any input; may be pessimistic.
- Average case: expectations under stated distributions; powerful but model-sensitive.
- Best case: demonstrates adaptability; not a guarantee.
- Use randomization and fallbacks to combine strong typical performance with safety against adversaries.