Intuition

The normal distribution is the symmetric bell curve that shows up whenever many small, independent effects add together. Heights, measurement errors, exam scores - all tend to cluster around a central value with symmetric tails. The curve is entirely described by two numbers: where it is centered and how wide it spreads. This simplicity, combined with the Central Limit Theorem, makes it the single most important distribution in statistics.

Definition

A continuous random variable $X$ follows a normal distribution with mean $\mu$ and standard deviation $\sigma$ (written $X \sim N(\mu, \sigma^2)$) if its probability density function is:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad -\infty < x < \infty$$

  • $\mu$ controls the center (location) of the bell.
  • $\sigma$ controls the width (spread); larger $\sigma$ means flatter and wider.
  • The distribution is symmetric about $\mu$, so the mean, median, and mode coincide.

The standard normal distribution is the special case $\mu = 0$, $\sigma = 1$. Its CDF is denoted $\Phi$ and serves as the universal reference for all normal probabilities.
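As a quick sanity check, the density and CDF can be coded directly from the formula. A minimal sketch in pure Python (`normal_pdf` and `phi` are names chosen here, not from any particular library; `phi` uses the identity $\Phi(x) = \tfrac{1}{2}(1 + \operatorname{erf}(x/\sqrt{2}))$):

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def phi(x):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# The bell peaks at the mean, and half the mass lies below it.
print(round(normal_pdf(0.0), 4))  # 0.3989, i.e. 1/sqrt(2*pi)
print(phi(0.0))                   # 0.5
```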

Key Formulas

Standardizing transformation - convert any normal variable to standard normal:

$$Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$$

This lets you look up probabilities in a single $z$-table or use a single CDF $\Phi$.
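For instance, to find $P(X \le 8)$ for $X \sim N(5, 2^2)$ (numbers chosen purely for illustration), standardize once and evaluate one CDF:

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

mu, sigma, x = 5.0, 2.0, 8.0
z = (x - mu) / sigma       # z = 1.5
prob = phi(z)
print(round(prob, 4))      # 0.9332, the same value a z-table lists for z = 1.5
```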

The 68-95-99.7 rule (empirical rule) - worth memorizing:

| Interval | Probability |
| --- | --- |
| $\mu \pm \sigma$ | $\approx 68.3\%$ |
| $\mu \pm 2\sigma$ | $\approx 95.4\%$ |
| $\mu \pm 3\sigma$ | $\approx 99.7\%$ |
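These probabilities follow directly from the standard normal CDF: the mass within $k$ standard deviations of the mean is $\Phi(k) - \Phi(-k)$, regardless of $\mu$ and $\sigma$. A quick check:

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# P(mu - k*sigma <= X <= mu + k*sigma) = phi(k) - phi(-k) for any mu, sigma.
for k in (1, 2, 3):
    print(k, round(phi(k) - phi(-k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```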

Moment-generating function:

$$M_X(t) = \exp\!\left(\mu t + \tfrac{1}{2}\sigma^2 t^2\right)$$

Linear combinations: If $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$ are independent, then $aX + bY \sim N(a\mu_1 + b\mu_2,\; a^2\sigma_1^2 + b^2\sigma_2^2)$.
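A short simulation confirms the rule for one concrete pair (parameters here are illustrative): with $X \sim N(1, 2^2)$ and $Y \sim N(3, 1^2)$ independent, $2X + Y$ should be $N(5, 17)$.

```python
import random
import statistics

random.seed(0)  # reproducible run

n = 200_000
samples = [2.0 * random.gauss(1.0, 2.0) + random.gauss(3.0, 1.0)
           for _ in range(n)]

print(round(statistics.mean(samples), 2))      # ~5.0  = 2*1 + 3
print(round(statistics.variance(samples), 1))  # ~17.0 = 2^2 * 2^2 + 1^2 * 1^2
```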

Example

Manufacturing tolerances. A machine produces ball bearings whose diameters (in mm) follow a normal distribution $N(\mu, \sigma^2)$. Bearings whose diameters fall outside the tolerance band $\mu \pm 2.5\sigma$ are scrapped. What proportion is scrap?

Standardize the upper bound:

$$z = \frac{(\mu + 2.5\sigma) - \mu}{\sigma} = 2.5$$

By symmetry, the proportion outside tolerance is:

$$P(|Z| > 2.5) = 2\left(1 - \Phi(2.5)\right) = 2(1 - 0.9938) \approx 0.0124$$

So about 1.24% of production is scrapped - a number that directly informs cost analysis and quality control decisions.

If the process spread drifted up by 25%, the tolerance band would sit only 2 standard deviations from the mean, and the scrap rate would jump to $2(1 - \Phi(2)) \approx 4.55\%$ - demonstrating how sensitive quality is to the spread parameter.
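Both figures fall out of a two-line computation with $\Phi$ (sketched here in pure Python, assuming the tolerance sits $2.5\sigma$ from the mean nominally, consistent with the 1.24% figure, and $2\sigma$ after a drift in spread):

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def scrap_rate(z):
    """Two-sided tail mass beyond +/- z standard deviations."""
    return 2.0 * (1.0 - phi(z))

print(f"{scrap_rate(2.5):.2%}")  # 1.24% at nominal spread
print(f"{scrap_rate(2.0):.2%}")  # 4.55% once the band is only 2 sigma wide
```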

Why It Matters in CS

The 68-95-99.7 rule is burned into every engineer’s brain for a reason: it lets you eyeball whether data is behaving normally without running a formal test. If roughly 5% of your values fall outside two standard deviations, things are probably fine. If 20% do, something interesting is going on.
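That eyeball test is only a few lines of code (a sketch using synthetic data; in practice `data` would be your actual measurements):

```python
import random
import statistics

random.seed(1)
data = [random.gauss(50.0, 10.0) for _ in range(10_000)]  # stand-in for real data

mu = statistics.mean(data)
sd = statistics.stdev(data)
frac_outside = sum(abs(x - mu) > 2.0 * sd for x in data) / len(data)

# A normal distribution predicts roughly 5% outside 2 standard deviations.
print(f"{frac_outside:.1%}")
```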

In ML, the normal shows up constantly. Weight initialization in neural networks samples from zero-mean normals $N(0, \sigma^2)$ with small $\sigma$ because symmetric, light-tailed starting points help gradient flow. Gaussian Mixture Models are just “what if the data came from overlapping bell curves?” Variational autoencoders and diffusion models both lean on the normal as a latent prior because it’s easy to sample from and has nice analytic properties.
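For example, a He-style initializer draws each weight from $N(0, 2/\text{fan\_in})$ (one common scheme; the exact variance varies by framework and activation function, and `fan_in = 256` is an assumed layer width):

```python
import math
import random
import statistics

random.seed(0)
fan_in = 256  # illustrative layer width

sigma = math.sqrt(2.0 / fan_in)  # He initialization scale
weights = [random.gauss(0.0, sigma) for _ in range(fan_in)]

# The empirical spread should sit near the target sigma (~0.088 here).
print(round(statistics.stdev(weights), 3))
```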

Note

OLS regression assumes errors $\varepsilon_i \sim N(0, \sigma^2)$, which is what justifies $t$-tests on coefficients. If the residuals aren’t roughly normal, those p-values you’re reading off the regression output may not mean much.
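A crude version of that residual check: fit a toy regression and compare the residuals against the empirical rule (illustrative only; a Q-Q plot or a formal test such as Shapiro-Wilk is the real tool):

```python
import random
import statistics

random.seed(2)

# Toy data: y = 2x + 1 + normal noise, so the residuals *should* look normal.
n = 2_000
xs = [random.uniform(0.0, 10.0) for _ in range(n)]
ys = [2.0 * x + 1.0 + random.gauss(0.0, 1.0) for x in xs]

# Least-squares fit via the closed-form slope/intercept formulas.
mx, my = statistics.mean(xs), statistics.mean(ys)
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
intercept = my - slope * mx

resid = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
sd = statistics.stdev(resid)
frac_within_1sd = sum(abs(r) < sd for r in resid) / n

print(f"{frac_within_1sd:.0%}")  # near 68% if the residuals are roughly normal
```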