Intuition

Some events happen at a roughly constant average rate: server requests per second, typos per page, radioactive decays per minute. The Poisson distribution counts how many such events occur in a fixed interval of time or space. It works best when events are independent, arrive one at a time, and the average rate does not change across the interval.

Definition

A random variable $X$ follows a Poisson distribution if it represents the number of events occurring in a fixed interval (time, area, volume, etc.) where events happen independently at a constant average rate.

$X$ can take non-negative integer values $k = 0, 1, 2, \ldots$

The parameter $\lambda = rt$ is the expected number of events in the interval of length $t$ with rate $r$.

Three assumptions underlie the Poisson model:

  1. Independence: the occurrence of one event does not affect the probability of another.
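  2. Proportionality: for a small sub-interval of length $\Delta t$, $P(\text{one event in } \Delta t) \approx r\,\Delta t$, where $r$ is the average rate.
  3. No simultaneous events: $P(\text{two or more events in } \Delta t) \approx 0$.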
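The three assumptions can be turned into a small simulation. This is a minimal sketch, not a reference implementation: it chops one interval into many tiny slices, lets each slice hold at most one event with probability equal to rate times slice length, and draws the slices independently. The function name `simulate_count`, the rate of 2, the interval length of 2, and the slice count are all illustrative choices.

```python
import random

def simulate_count(rate, length, n_slices=1000, rng=random):
    """Count events in one interval by flipping a coin in each tiny slice."""
    dt = length / n_slices
    p_event = rate * dt  # proportionality: P(one event in dt) ~ rate * dt
    # Each slice holds at most one event (no simultaneous events) and
    # slices are drawn independently of each other (independence).
    return sum(1 for _ in range(n_slices) if rng.random() < p_event)

rng = random.Random(0)  # fixed seed so the run is repeatable
counts = [simulate_count(rate=2.0, length=2.0, rng=rng) for _ in range(2000)]
mean_count = sum(counts) / len(counts)
print(f"average count over 2000 intervals: {mean_count:.2f}")  # should land near rate * length = 4
```

Strictly speaking each simulated count is binomial, but with 1000 slices and a small per-slice probability it behaves very much like a Poisson count.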

Key Formulas

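Probability Mass Function (PMF):

$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}, \quad k = 0, 1, 2, \ldots$$

Mean: $E[X] = \lambda$

Variance: $\operatorname{Var}(X) = \lambda$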

A distinctive (and useful) feature: the mean and variance are equal. This gives you a quick diagnostic: if your observed variance is much larger than the mean, the data are overdispersed and a Poisson model probably doesn’t fit.
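The equality of mean and variance can be checked numerically. The sketch below assumes the standard PMF $P(X = k) = \lambda^k e^{-\lambda} / k!$; the helper name `poisson_pmf`, the value $\lambda = 4$, and the truncation point of 100 are illustrative choices.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

lam = 4.0        # illustrative value
ks = range(100)  # truncate the infinite sum; the tail past 100 is negligible for lam = 4
probs = [poisson_pmf(k, lam) for k in ks]

total = sum(probs)
mean = sum(k * p for k, p in zip(ks, probs))
variance = sum((k - mean) ** 2 * p for k, p in zip(ks, probs))

# All three printed values should agree with 1, lam, and lam up to rounding error.
print(total, mean, variance)
```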

Standard Deviation: $\sigma = \sqrt{\lambda}$

Moment Generating Function: $M_X(s) = e^{\lambda(e^{s} - 1)}$

Poisson approximation to the binomial:

When $n$ is large and $p$ is small, $\text{Binomial}(n, p) \approx \text{Poisson}(\lambda = np)$. A common rule of thumb is $n \geq 20$ and $p \leq 0.05$.
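To see how close the approximation gets, the two PMFs can be compared term by term. This is a rough sketch: the values $n = 1000$ and $p = 0.002$ are hypothetical, picked only so that $n$ is large and $p$ is small, and the helper names are illustrative.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for a Binomial(n, p) random variable."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

n, p = 1000, 0.002  # hypothetical values: large n, small p
lam = n * p         # matching Poisson mean

# Side-by-side probabilities for the first few counts.
for k in range(6):
    print(k, round(binomial_pmf(k, n, p), 5), round(poisson_pmf(k, lam), 5))

# Largest pointwise difference over k = 0..20.
max_gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(21))
print("largest gap:", max_gap)
```

The Poisson side needs only one parameter and avoids the large binomial coefficients, which is the practical appeal of the approximation.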

Example

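Aircraft arrive at an airport at an average rate of 2 per hour ($r = 2$). What is the probability that exactly 5 arrive in a 2-hour window?

Here $\lambda = rt = 2 \times 2 = 4$, so

$$P(X = 5) = \frac{4^5 e^{-4}}{5!} = \frac{1024 \times 0.0183}{120} \approx 0.156$$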

There is roughly a 15.6% chance of exactly 5 arrivals in the 2-hour period.

We can also find the probability of no arrivals in a 30-minute window ($\lambda = 2 \times 0.5 = 1$):

$$P(X = 0) = \frac{1^0 e^{-1}}{0!} = e^{-1} \approx 0.368$$

So there is about a 37% chance of a quiet half-hour with no aircraft arrivals.
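Both aircraft results can be reproduced in a few lines. A minimal sketch, assuming the standard Poisson PMF; the function and variable names are illustrative.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

rate_per_hour = 2.0

# Exactly 5 arrivals in 2 hours: expected count is 2 * 2 = 4.
p_five_in_two_hours = poisson_pmf(5, rate_per_hour * 2.0)

# No arrivals in half an hour: expected count is 2 * 0.5 = 1.
p_none_in_half_hour = poisson_pmf(0, rate_per_hour * 0.5)

print(f"{p_five_in_two_hours:.4f}")  # 0.1563
print(f"{p_none_in_half_hour:.4f}")  # 0.3679
```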

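As a binomial approximation: if $n = 1000$ components each fail independently with probability $p$, the expected number of failures is $\lambda = np = 1000p$. Rather than computing $\text{Binomial}(1000, p)$ probabilities directly, we use $\text{Poisson}(\lambda = 1000p)$.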

Why It Matters in CS

The entire field of queueing theory starts with “assume arrivals are Poisson.” The M/M/1 queue, the M/M/c queue, basically every tractable queueing model uses this as the arrival process because it makes the math work out to closed-form solutions for wait times and queue lengths. When you size a buffer on a router or set autoscaling thresholds on a web server, there’s a good chance a Poisson assumption is somewhere in the analysis.

Warning

Real network traffic is often bursty, which violates the independence assumption and produces overdispersion (variance >> mean). The Poisson is a starting point, not gospel. If your observed variance is three times your mean, you need a different model.
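One quick way to run this check is the dispersion index, the ratio of sample variance to sample mean, which sits near 1 for Poisson-like counts. The sketch below uses two made-up series of per-second request counts; the data and the function name `dispersion_index` are purely illustrative.

```python
import statistics

def dispersion_index(counts):
    """Ratio of sample variance to sample mean; near 1 for Poisson-like data."""
    return statistics.variance(counts) / statistics.mean(counts)

steady = [4, 2, 7, 3, 5, 1, 6, 4, 5, 3]    # made-up counts, fairly even arrivals
bursty = [0, 0, 1, 0, 18, 0, 1, 0, 20, 0]  # made-up counts with two bursts, same mean

idx_steady = dispersion_index(steady)
idx_bursty = dispersion_index(bursty)
print(f"steady: {idx_steady:.2f}")  # roughly 1
print(f"bursty: {idx_bursty:.2f}")  # far above 1, i.e. overdispersed
```

With only ten observations the index is noisy, so treat it as a first screen rather than a formal test.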

The mean-equals-variance property also makes Poisson a natural baseline for anomaly detection. If your error logs normally see $\lambda$ errors per minute on average, with $\lambda$ far below 15, observing 15 in a single minute is a clear signal: you can compute exactly how unlikely that is without fitting anything more complicated.
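How unlikely such a spike is can be computed as the upper-tail probability $P(X \geq 15)$. This is a sketch under a stated assumption: the baseline of 3 errors per minute is hypothetical and not taken from the text, and `poisson_tail` is an illustrative helper name.

```python
import math

def poisson_tail(k, lam):
    """P(X >= k) for a Poisson random variable with mean lam."""
    # Sum the probabilities of 0..k-1 events and take the complement.
    below = sum(lam ** i * math.exp(-lam) / math.factorial(i) for i in range(k))
    return 1.0 - below

baseline = 3.0  # hypothetical normal error rate per minute (an assumption)
p_value = poisson_tail(15, baseline)
print(f"P(X >= 15 | lam = {baseline}) = {p_value:.2e}")
```

An alert threshold could then be set by picking the smallest count whose tail probability falls below a chosen false-alarm rate.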