Intuition
Some events happen at a roughly constant average rate: server requests per second, typos per page, radioactive decays per minute. The Poisson distribution counts how many such events occur in a fixed interval of time or space. It works best when events are independent, arrive one at a time, and the average rate does not change across the interval.
Definition
A random variable $X$ follows a Poisson distribution, written $X \sim \text{Poisson}(\lambda)$, if it represents the number of events occurring in a fixed interval (time, area, volume, etc.) where events happen independently at a constant average rate.
$X$ can take the non-negative integer values $0, 1, 2, \dots$
The parameter $\lambda = rt$ is the expected number of events in an interval of length $t$ with rate $r$.
Three assumptions underlie the Poisson model:
- Independence: the occurrence of one event does not affect the probability of another.
- Proportionality: for a small sub-interval of length $\Delta t$, $P(\text{exactly one event in } \Delta t) \approx r\,\Delta t$.
- No simultaneous events: $P(\text{two or more events in } \Delta t) \approx 0$.
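The three assumptions can be turned directly into a small simulation. The sketch below is illustrative only: it chops an interval into many tiny slots, lets each slot hold at most one event with probability proportional to its length, and treats slots as independent. The function name `simulate_count`, the slot count, and the rate and length values are assumptions chosen for the illustration.

```python
import random

def simulate_count(rate, length, slots=1_000, rng=random):
    """Count events in one interval built from many tiny independent slots."""
    dt = length / slots          # length of each small sub-interval
    p = rate * dt                # proportionality: P(one event in dt) ~ rate * dt
    # Each slot holds at most one event (no simultaneous events),
    # and slots are drawn independently of each other.
    return sum(1 for _ in range(slots) if rng.random() < p)

rng = random.Random(0)           # fixed seed so the run is repeatable
counts = [simulate_count(rate=2.0, length=2.0, rng=rng) for _ in range(2_000)]

mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
print(f"sample mean = {mean:.2f}, sample variance = {var:.2f}")
```

With a rate of 2 and an interval of length 2, both the sample mean and the sample variance should land near 4, which is what a Poisson count with an expected value of 4 predicts.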
Key Formulas
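Probability Mass Function (PMF):
$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}, \quad k = 0, 1, 2, \dots$$
Mean:
$$E[X] = \lambda$$
Variance:
$$\text{Var}(X) = \lambda$$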
A distinctive (and useful) feature: the mean and variance are equal. This gives you a quick diagnostic: if your observed variance is much larger than the mean (overdispersion), a Poisson model probably doesn’t fit.
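One way to apply this diagnostic is to compute the ratio of sample variance to sample mean, often called the index of dispersion. The sketch below is a minimal illustration; the helper name `dispersion_index` and both data sets are made up for the example.

```python
import statistics

def dispersion_index(counts):
    """Sample variance divided by sample mean; close to 1 for Poisson-like data."""
    mean = statistics.mean(counts)
    if mean == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return statistics.variance(counts) / mean

# Hypothetical per-minute event counts, invented for illustration.
steady = [2, 4, 3, 5, 1, 3, 4, 2]      # spread roughly matches the mean
bursty = [0, 0, 12, 0, 1, 0, 11, 0]    # same mean, but events arrive in bursts

print(f"steady: {dispersion_index(steady):.2f}")
print(f"bursty: {dispersion_index(bursty):.2f}")
```

A ratio far above 1, as in the bursty series, suggests overdispersion. With samples this small the ratio is noisy, so treat it as a rough screen rather than a formal test.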
Standard Deviation:
$$\sigma = \sqrt{\lambda}$$
Moment Generating Function:
$$M_X(s) = \exp\!\big(\lambda (e^{s} - 1)\big)$$
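As a consistency check, the mean and variance can be recovered from the moment generating function $M_X(s) = \exp(\lambda(e^{s} - 1))$ by differentiating at $s = 0$. The variable name $s$ is a notational choice made here to avoid clashing with an interval length $t$.

$$
\begin{aligned}
M_X'(s) &= \lambda e^{s} \exp\!\big(\lambda(e^{s} - 1)\big) &&\Rightarrow\quad E[X] = M_X'(0) = \lambda \\
M_X''(s) &= \big(\lambda e^{s} + \lambda^2 e^{2s}\big) \exp\!\big(\lambda(e^{s} - 1)\big) &&\Rightarrow\quad E[X^2] = M_X''(0) = \lambda + \lambda^2 \\
\text{Var}(X) &= E[X^2] - (E[X])^2 = \lambda + \lambda^2 - \lambda^2 = \lambda
\end{aligned}
$$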
Poisson approximation to the binomial:
When $n$ is large and $p$ is small, $\text{Binomial}(n, p) \approx \text{Poisson}(\lambda = np)$. A common rule of thumb is $n \ge 20$ and $p \le 0.05$.
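To get a feel for how close the approximation is, the two PMFs can be compared numerically. The sketch below uses only the standard library. The values `n = 1000` and `p = 0.002` are hypothetical, picked to sit well inside the "large n, small p" regime.

```python
import math

def binomial_pmf(k, n, p):
    """Exact binomial probability of k successes in n trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Poisson probability of k events when the expected count is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

n, p = 1000, 0.002     # hypothetical values for illustration
lam = n * p            # matching Poisson parameter

# Largest pointwise gap between the two PMFs over k = 0..20.
max_gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(21))

for k in range(6):
    print(f"k={k}: binomial={binomial_pmf(k, n, p):.5f}  poisson={poisson_pmf(k, lam):.5f}")
print(f"largest gap for k <= 20: {max_gap:.2e}")
```

The gap shrinks as `p` gets smaller with `n * p` held fixed, which is the sense in which the Poisson is a limit of the binomial.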
Example
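Aircraft arrive at an airport at an average rate of 2 per hour ($r = 2$). What is the probability that exactly 5 arrive in a 2-hour window?
Here $\lambda = rt = 2 \times 2 = 4$.
$$P(X = 5) = \frac{4^5 e^{-4}}{5!} = \frac{1024\, e^{-4}}{120} \approx 0.156$$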
There is roughly a 15.6% chance of exactly 5 arrivals in the 2-hour period.
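We can also find the probability of no arrivals in a 30-minute window ($t = 0.5$ hours, so $\lambda = 2 \times 0.5 = 1$):
$$P(X = 0) = \frac{1^0 e^{-1}}{0!} = e^{-1} \approx 0.368$$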
So there is about a 37% chance of a quiet half-hour with no aircraft arrivals.
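Both airport figures are easy to reproduce in a few lines. This is a minimal check using only the standard library. The helper `poisson_pmf` and the variable names are introduced here for illustration.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson count with expected value lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

rate_per_hour = 2.0

# Exactly 5 arrivals in a 2-hour window: expected count = 2 * 2 = 4.
p_five = poisson_pmf(5, rate_per_hour * 2.0)

# No arrivals in a 30-minute window: expected count = 2 * 0.5 = 1.
p_quiet = poisson_pmf(0, rate_per_hour * 0.5)

print(f"P(5 arrivals in 2 h)    = {p_five:.4f}")   # 0.1563
print(f"P(0 arrivals in 30 min) = {p_quiet:.4f}")  # 0.3679
```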
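As a binomial approximation: if $n = 1000$ components each fail independently with probability $p$, the expected number of failures is $\lambda = np = 1000p$. Rather than computing $\binom{1000}{k} p^k (1-p)^{1000-k}$, we use $\frac{\lambda^k e^{-\lambda}}{k!}$.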
Why It Matters in CS
Much of classical queueing theory starts with “assume arrivals are Poisson.” The M/M/1 queue, the M/M/c queue, and most other textbook queueing models use a Poisson arrival process because it makes the math work out to closed-form solutions for wait times and queue lengths. When you size a buffer on a router or set autoscaling thresholds on a web server, there’s a good chance a Poisson assumption is somewhere in the analysis.
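To illustrate what "closed-form" buys you, here is a sketch of the standard steady-state results for an M/M/1 queue (Poisson arrivals, exponential service times, one server). The function name `mm1_metrics`, the dictionary keys, and the example rates are assumptions made for this illustration. The formulas only hold when the arrival rate is strictly below the service rate.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state averages for an M/M/1 queue (requires arrival_rate < service_rate)."""
    if not 0 < arrival_rate < service_rate:
        raise ValueError("need 0 < arrival_rate < service_rate for a stable queue")
    rho = arrival_rate / service_rate                 # server utilization
    return {
        "utilization": rho,
        "mean_in_system": rho / (1 - rho),            # average jobs waiting or in service
        "mean_in_queue": rho**2 / (1 - rho),          # average jobs waiting only
        "mean_time_in_system": 1 / (service_rate - arrival_rate),
        "mean_wait_in_queue": rho / (service_rate - arrival_rate),
    }

# Hypothetical server: 8 requests/s arriving, capacity of 10 requests/s.
metrics = mm1_metrics(arrival_rate=8.0, service_rate=10.0)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note how quickly the averages grow as utilization approaches 1. That sensitivity is one reason buffer sizing and autoscaling thresholds are usually set with some headroom.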
Warning
Real network traffic is often bursty, which violates the independence assumption and produces overdispersion (variance >> mean). The Poisson is a starting point, not gospel. If your observed variance is three times your mean, you need a different model.
The mean-equals-variance property also makes Poisson a natural baseline for anomaly detection. If your error logs normally average $\lambda$ errors per minute, with $\lambda$ small, observing 15 in a single minute is a clear signal: you can compute exactly how unlikely that is, $P(X \ge 15)$, without fitting anything more complicated.
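The tail probability behind that kind of alert can be computed directly. The sketch below assumes a hypothetical baseline of 2 errors per minute, a value chosen only for illustration. It sums the PMF upward from the threshold rather than subtracting a CDF from 1, which avoids losing precision when the tail is tiny. The helper name `poisson_tail` is also an assumption.

```python
import math

def poisson_tail(k_min, lam, tol=1e-18):
    """P(X >= k_min) for X ~ Poisson(lam), summed term by term from k_min upward."""
    term = math.exp(-lam) * lam**k_min / math.factorial(k_min)  # P(X = k_min)
    total = 0.0
    k = k_min
    # Keep adding terms until they are negligible and we are past the peak near lam.
    while term > tol or k <= lam:
        total += term
        k += 1
        term *= lam / k      # P(X = k) = P(X = k - 1) * lam / k
    return total

baseline = 2.0               # hypothetical errors per minute
p_spike = poisson_tail(15, baseline)
print(f"P(at least 15 errors in a minute) = {p_spike:.2e}")
```

Under this baseline the probability is on the order of a few in a billion, so a minute with 15 errors is very unlikely to be ordinary Poisson noise.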
Related Notes
- Exponential Distribution - models the continuous time between Poisson events
- Binomial Distribution - the Poisson approximates the binomial for large $n$ and small $p$
- Probability Distributions - broader taxonomy of discrete and continuous distributions