Intuition

If you could repeat an experiment infinitely many times and average the results, you would get the expected value. It is the long-run average, the balance point of the distribution, the single number that summarizes where outcomes tend to land.

Expected value does not have to be a value the random variable can actually take. A fair die has $E[X] = 3.5$ - you will never roll a 3.5, but over thousands of rolls the average converges to it. This is the essence of the law of large numbers.
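The convergence described above can be checked with a short simulation. This is a minimal sketch; the function name `average_of_die_rolls` and the fixed seed are illustrative choices, not part of the original text.

```python
import random

def average_of_die_rolls(num_rolls, seed=0):
    """Average of `num_rolls` simulated rolls of a fair six-sided die."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = sum(rng.randint(1, 6) for _ in range(num_rolls))
    return total / num_rolls

# The sample average should drift toward 3.5 as the number of rolls grows.
for n in (10, 1_000, 100_000):
    print(n, average_of_die_rolls(n))
```

No single roll ever equals 3.5, yet the averages for larger `n` should sit close to it.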

The power of expected value lies in its linearity: you can break complex quantities into simpler pieces, compute each expectation separately, and add them up - no independence required.

Definition

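The expected value (or mean) of a random variable $X$ is the probability-weighted sum (or integral) of its possible values:

  • Discrete: $X$ takes values $x_1, x_2, \ldots$ with probability mass function $p(x)$:

    $$E[X] = \sum_i x_i \, p(x_i)$$

  • Continuous: $X$ has probability density function $f(x)$:

    $$E[X] = \int_{-\infty}^{\infty} x \, f(x) \, dx$$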
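For the discrete case, the definition translates almost directly into code. The sketch below assumes the distribution is given as a dictionary mapping each value to its probability; the helper name `expected_value` is hypothetical.

```python
from fractions import Fraction

def expected_value(pmf):
    """Probability-weighted sum of the values of a discrete distribution.

    `pmf` maps each possible value to its probability.
    """
    if abs(sum(pmf.values()) - 1) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(value * prob for value, prob in pmf.items())

# Fair six-sided die: every face has probability 1/6.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}
print(expected_value(fair_die))  # 7/2
```

Using `Fraction` keeps the arithmetic exact; plain floats work too, at the cost of rounding error.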

Note

The expected value exists only if the sum or integral converges absolutely; otherwise it is undefined (e.g., the Cauchy distribution has no mean).

Key Formulas

Expectation of a function

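For any function $g$ of a random variable $X$ with probability mass function $p$ (or density $f$):

$$E[g(X)] = \sum_x g(x) \, p(x) \quad \text{(discrete)}, \qquad E[g(X)] = \int_{-\infty}^{\infty} g(x) \, f(x) \, dx \quad \text{(continuous)}$$

This is used constantly - for instance, setting $g(x) = x^2$ gives $E[X^2]$, which appears in the variance formula $\mathrm{Var}(X) = E[X^2] - (E[X])^2$.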
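As a small worked instance, the sketch below applies a function $g$ to each outcome of a fair die before weighting by probability, then combines $E[X]$ and $E[X^2]$ into the variance via $\mathrm{Var}(X) = E[X^2] - (E[X])^2$. The helper name `expectation_of` is an illustrative choice.

```python
from fractions import Fraction

def expectation_of(g, pmf):
    """E[g(X)] for a discrete X: weight g(x) by the probability of x."""
    return sum(g(x) * p for x, p in pmf.items())

fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

mean = expectation_of(lambda x: x, fair_die)              # E[X]
second_moment = expectation_of(lambda x: x**2, fair_die)  # E[X^2]
variance = second_moment - mean**2                        # E[X^2] - (E[X])^2

print(mean, second_moment, variance)  # 7/2 91/6 35/12
```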

Linearity of expectation

For any random variables $X$ and $Y$ and constants $a, b$:

$$E[aX + bY] = a \, E[X] + b \, E[Y]$$

This holds regardless of whether $X$ and $Y$ are independent. It is arguably the most useful property in all of probability.
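To see that independence really is not needed, the sketch below uses two deliberately dependent variables defined on the same die roll: $X$ is the face shown and $Y = X^2$. The constants and helper names are illustrative assumptions.

```python
from fractions import Fraction

OUTCOMES = range(1, 7)       # faces of one fair die
PROB = Fraction(1, 6)        # probability of each face

def expectation(f):
    """Expected value of f(outcome) over a single fair die roll."""
    return sum(f(w) * PROB for w in OUTCOMES)

def x(w):
    return w

def y(w):
    return w * w             # fully determined by x, so strongly dependent

a, b = 2, 3
combined = expectation(lambda w: a * x(w) + b * y(w))  # E[aX + bY] directly
by_linearity = a * expectation(x) + b * expectation(y) # a E[X] + b E[Y]

print(combined, by_linearity)  # 105/2 105/2
```

Both routes give the same number even though knowing $X$ tells you $Y$ exactly.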

Expectation of a product (independent case)

If $X$ and $Y$ are independent:

$$E[XY] = E[X] \, E[Y]$$

This does not hold in general. When $X$ and $Y$ are dependent, the difference $E[XY] - E[X] \, E[Y]$ is exactly the covariance $\mathrm{Cov}(X, Y)$.
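The contrast between the two cases can be computed exactly for dice. In this sketch the independent case uses two separate fair dice, and the dependent case sets $Y = X$ (the same roll), so the gap $E[XY] - E[X]E[Y]$ equals the variance of a single die. Variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)

e_x = sum(Fraction(x, 6) for x in FACES)  # E[X] for one fair die

# Independent case: two separate dice, each ordered pair has probability 1/36.
e_xy_independent = sum(Fraction(x * y, 36) for x, y in product(FACES, repeat=2))

# Dependent case: Y is the very same roll as X, so XY = X^2.
e_xy_dependent = sum(Fraction(x * x, 6) for x in FACES)

covariance = e_xy_dependent - e_x * e_x   # E[XY] - E[X]E[Y] with Y = X

print(e_xy_independent, e_x * e_x)  # 49/4 49/4
print(e_xy_dependent, covariance)   # 91/6 35/12
```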

Example

Expected sales commission. A salesperson is working three deals with the following payoffs and close probabilities:

Deal    Commission    Close probability
A       $5,000        0.40
B       $8,000        0.25
C       $2,000        0.60

The expected commission from each deal: $E_A = 5000 \times 0.40 = \$2{,}000$, $E_B = 8000 \times 0.25 = \$2{,}000$, $E_C = 2000 \times 0.60 = \$1{,}200$.

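By linearity (even if the deals are correlated), the expected total commission is the sum of the per-deal expectations:

$$E[\text{total}] = E_A + E_B + E_C = \$2{,}000 + \$2{,}000 + \$1{,}200 = \$5{,}200$$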
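The commission example can be reproduced in a few lines. The sketch below first sums the per-deal expectations, then simulates one hypothetical way the deals might be correlated: a single shared random draw decides all three, which keeps each deal's close probability intact while making the outcomes strongly dependent. That correlation model is an assumption made for illustration only.

```python
import random
from fractions import Fraction

# Deal name -> (commission in dollars, close probability), as in the example.
deals = {
    "A": (5000, Fraction("0.40")),
    "B": (8000, Fraction("0.25")),
    "C": (2000, Fraction("0.60")),
}

# Linearity: add up commission * probability for each deal.
expected_total = sum(amount * prob for amount, prob in deals.values())
print(expected_total)  # 5200

def simulate_correlated_total(trials, seed=0):
    """Average total commission when one shared draw drives every deal."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        u = rng.random()  # shared draw: a deal closes when u < its probability
        total += sum(amount for amount, prob in deals.values() if u < prob)
    return total / trials

print(simulate_correlated_total(200_000))
```

The simulated average should land near 5,200 even though the deals are far from independent.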

Expected component lifespan. A component’s lifetime $T$ follows an exponential distribution with rate $\lambda$ failures per hour. Its expected lifespan is:

$$E[T] = \frac{1}{\lambda} \text{ hours}$$

This directly informs maintenance scheduling and spare-parts inventory.
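The mean of an exponential distribution with rate $\lambda$ is $1/\lambda$, which a quick simulation can confirm. The rate used below (0.001 failures per hour) is a hypothetical value chosen for illustration, not a figure from the example above.

```python
import random

RATE_PER_HOUR = 0.001  # hypothetical failure rate, chosen only for this sketch

def simulated_mean_lifetime(rate, samples, seed=0):
    """Average of `samples` exponential lifetimes drawn with the given rate."""
    rng = random.Random(seed)
    return sum(rng.expovariate(rate) for _ in range(samples)) / samples

print(1 / RATE_PER_HOUR)                                # theoretical mean, in hours
print(simulated_mean_lifetime(RATE_PER_HOUR, 200_000))  # simulated mean, in hours
```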

Why It Matters in CS

  • Average-case algorithm analysis. The expected number of comparisons in randomized Quicksort is $O(n \log n)$, derived using linearity of expectation over indicator random variables. See Best, Worst, and Average Cases.
  • Performance modeling. Expected response time, expected throughput, and expected queue length (via Little’s Law: $L = \lambda W$) are the bread and butter of systems performance engineering.
  • Network analysis. Expected packet delay, expected number of retransmissions, and expected path latency guide protocol design and capacity planning.
  • Machine learning. Loss functions are expectations: $L(\theta) = E_{(x, y) \sim \mathcal{D}}[\ell(f_\theta(x), y)]$. Training minimizes the empirical (sample-average) loss; generalization theory bounds the true expected loss.
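As one concrete illustration of the first item, the sketch below counts pivot comparisons in a simple randomized Quicksort and compares the average against the known closed form for distinct keys, $2(n+1)H_n - 4n$, where $H_n$ is the $n$-th harmonic number. The function names are hypothetical, and the list-based partitioning is chosen for clarity rather than efficiency.

```python
import random

def quicksort_comparisons(items, rng):
    """Number of element-vs-pivot comparisons for randomized Quicksort.

    Assumes all items are distinct.
    """
    if len(items) <= 1:
        return 0
    pivot = rng.choice(items)
    smaller = [v for v in items if v < pivot]
    larger = [v for v in items if v > pivot]
    # In the standard cost model each non-pivot element meets the pivot once.
    return (len(items) - 1
            + quicksort_comparisons(smaller, rng)
            + quicksort_comparisons(larger, rng))

def average_comparisons(n, trials, seed=0):
    """Average comparison count over `trials` runs on n distinct keys."""
    rng = random.Random(seed)
    keys = list(range(n))
    return sum(quicksort_comparisons(keys, rng) for _ in range(trials)) / trials

def exact_expected_comparisons(n):
    """Closed form 2(n+1)H_n - 4n for the expected comparison count."""
    harmonic = sum(1 / k for k in range(1, n + 1))
    return 2 * (n + 1) * harmonic - 4 * n

print(average_comparisons(200, 300), exact_expected_comparisons(200))
```

The two printed numbers should be close, and both grow on the order of $n \log n$.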