Intuition

Take any population - skewed, bimodal, uniform, it doesn't matter - and repeatedly draw random samples of size $n$. Compute the sample mean each time. As $n$ grows, those sample means form a distribution that looks increasingly normal, regardless of what the original population looked like. This is the Central Limit Theorem (CLT), and it is the reason the normal distribution dominates statistics: even when individual data aren't Gaussian, averages of enough data points are.
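A quick simulation makes this concrete. The sketch below (standard library only, with an assumed exponential population - heavily right-skewed, so a deliberately non-normal starting point) collects many sample means and checks that they concentrate around the population mean with spread $\sigma/\sqrt{n}$:

```python
import random
import statistics

random.seed(42)

n = 50            # size of each sample
trials = 20_000   # number of sample means to collect

# Exponential(rate=1) population: mean 1, variance 1, strongly right-skewed.
means = [statistics.fmean(random.expovariate(1.0) for _ in range(n))
         for _ in range(trials)]

# The means cluster near the population mean...
print(round(statistics.fmean(means), 3))   # ~1.0
# ...with spread close to sigma / sqrt(n) = 1 / sqrt(50) ~ 0.141
print(round(statistics.stdev(means), 3))
```

Plotting a histogram of `means` would show a nearly symmetric bell shape, even though every individual draw comes from a skewed distribution.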

Definition

Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed (i.i.d.) random variables with mean $\mu$ and finite variance $\sigma^2$. The CLT states that as $n \to \infty$:

$$\frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}} \xrightarrow{d} \mathcal{N}(0, 1)$$

where $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$ is the sample mean and $\xrightarrow{d}$ denotes convergence in distribution.

In practice, the approximation is considered reliable when $n \geq 30$, though the threshold depends on how non-normal the underlying distribution is. Highly skewed populations may need larger $n$.

Key Formulas

Standard error of the mean:

$$\mathrm{SE}(\bar{X}) = \frac{\sigma}{\sqrt{n}}$$

The standard error shrinks as $1/\sqrt{n}$ - quadrupling the sample size halves the standard error.
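A two-line helper (the name `standard_error` is hypothetical, not from the text) makes the scaling concrete:

```python
import math

def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

# Quadrupling the sample size halves the standard error.
print(standard_error(10.0, 100))   # 1.0
print(standard_error(10.0, 400))   # 0.5
```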

Standardized test statistic:

$$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

When $\sigma$ is unknown and estimated by the sample standard deviation $s$, use the $t$-distribution with $n - 1$ degrees of freedom instead:

$$t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$$

Sum version: The CLT also applies to sums. If $S_n = X_1 + X_2 + \cdots + X_n$, then for large $n$ the sum is approximately $\mathcal{N}(n\mu,\ n\sigma^2)$:

$$\frac{S_n - n\mu}{\sigma\sqrt{n}} \xrightarrow{d} \mathcal{N}(0, 1)$$
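Both test statistics are straightforward to compute directly. A minimal sketch, using assumed sample data and hypothetical helper names:

```python
import math
import statistics

def z_statistic(sample, mu, sigma):
    """(x_bar - mu) / (sigma / sqrt(n)) -- population sigma known."""
    n = len(sample)
    return (statistics.fmean(sample) - mu) / (sigma / math.sqrt(n))

def t_statistic(sample, mu):
    """(x_bar - mu) / (s / sqrt(n)) -- sigma estimated by the sample
    standard deviation s; compare against t with n - 1 degrees of freedom."""
    n = len(sample)
    s = statistics.stdev(sample)   # uses n - 1 in the denominator
    return (statistics.fmean(sample) - mu) / (s / math.sqrt(n))

data = [102.0, 98.0, 101.0, 99.0, 100.0, 103.0, 101.0, 100.0]
print(round(z_statistic(data, mu=100.0, sigma=2.0), 3))   # sigma assumed known
print(round(t_statistic(data, mu=100.0), 3))              # sigma estimated
```

The two statistics differ only in the denominator; with small $n$ the extra uncertainty in $s$ is what the heavier-tailed $t$-distribution accounts for.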

Tip

The CLT explains why many test statistics (z-tests, t-tests) and confidence intervals rely on the normal distribution - even when the raw data are not normal.

Example

Resistor quality control. A factory produces resistors with mean resistance $\mu = 100\ \Omega$ and standard deviation $\sigma = 10\ \Omega$. The individual resistance distribution is right-skewed (not normal). A quality inspector samples $n = 100$ resistors and measures the average.

By the CLT, $\bar{X}$ is approximately normal, with standard error $\sigma/\sqrt{n} = 10/\sqrt{100} = 1\ \Omega$:

$$\bar{X} \approx \mathcal{N}(100,\ 1)$$

What is the probability the sample average exceeds $101.5\ \Omega$?

$$P(\bar{X} > 101.5) = P\left(Z > \frac{101.5 - 100}{1}\right) = P(Z > 1.5) \approx 0.0668$$

About 6.7% - even though individual resistances are skewed, the CLT lets us use normal probability calculations on the sample mean.

Notice that increasing the sample to $n = 400$ would tighten the standard error to $0.5\ \Omega$, making the same deviation more significant ($z = 3$, $p \approx 0.0013$). The CLT quantifies exactly how more data sharpens inference.
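The tail probabilities above can be checked numerically. A sketch using assumed figures ($\mu = 100\ \Omega$, $\sigma = 10\ \Omega$, $n = 100$) and the normal survival function built from `math.erfc`:

```python
import math

def normal_sf(x: float, mu: float, sigma: float) -> float:
    """P(X > x) for X ~ N(mu, sigma^2), via the complementary error function."""
    z = (x - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))

mu, sigma, n = 100.0, 10.0, 100       # assumed population parameters (ohms)
se = sigma / math.sqrt(n)             # standard error = 1.0 ohm
print(round(normal_sf(101.5, mu, se), 4))    # ~0.0668, about 6.7%

# Quadruple the sample: the same 1.5-ohm deviation is now 3 standard errors out.
se4 = sigma / math.sqrt(4 * n)        # 0.5 ohm
print(round(normal_sf(101.5, mu, se4), 4))   # ~0.0013
```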

Why It Matters in CS

  • Monte Carlo simulation: averaging many random simulation runs yields normally distributed estimates, enabling confidence intervals on the result.
  • Algorithm analysis: when benchmarking runtime over many random inputs, the mean runtime is approximately normal, justifying Gaussian-based statistical tests for performance comparisons.
  • Large-scale data: in big-data pipelines, aggregate statistics (means, counts per partition) behave normally, which simplifies anomaly detection and threshold setting.
  • A/B testing: conversion rate differences across thousands of users are approximately normal, which is why z-tests power most A/B testing frameworks.
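As a concrete illustration of the Monte Carlo bullet (a hypothetical example, not from the text): estimating $\pi$ by random sampling, with a CLT-based 95% confidence interval on the estimate. Each trial is a Bernoulli "dart lands inside the quarter circle" indicator, and the mean of many trials is approximately normal:

```python
import math
import random
import statistics

random.seed(0)
runs = 100_000

# 1.0 if a uniform random point in the unit square lands inside the
# quarter circle x^2 + y^2 <= 1, else 0.0; the hit rate estimates pi/4.
hits = [1.0 if random.random()**2 + random.random()**2 <= 1.0 else 0.0
        for _ in range(runs)]

p_hat = statistics.fmean(hits)                  # estimates pi / 4
se = statistics.stdev(hits) / math.sqrt(runs)   # standard error of the mean
lo, hi = 4 * (p_hat - 1.96 * se), 4 * (p_hat + 1.96 * se)
print(f"pi ~ {4 * p_hat:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```

The interval is valid precisely because the CLT makes the averaged indicator approximately normal; quadrupling `runs` would halve its width.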