The Bernoulli and Binomial Distributions
Many events have just two outcomes: success or failure, click or no click, heads or tails. Two related distributions model these.
The Bernoulli distribution
A Bernoulli distribution describes a single trial with probability p of success and 1 minus p of failure. Its mean is p and its variance is p times one minus p, which is largest when p equals one half, the point of maximum uncertainty.
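The claim that the variance peaks at p equal to one half can be checked numerically. The following is a minimal sketch in plain Python; the function names and the grid of candidate p values are illustrative assumptions, not part of the original text.

```python
def bernoulli_mean(p: float) -> float:
    """Mean of a single Bernoulli trial with success probability p."""
    return p


def bernoulli_variance(p: float) -> float:
    """Variance of a single Bernoulli trial: p times (1 - p)."""
    return p * (1.0 - p)


# Scan p over 0.00, 0.01, ..., 1.00 and keep the value with the largest variance.
grid = [i / 100 for i in range(101)]
best_p = max(grid, key=bernoulli_variance)

print(best_p, bernoulli_variance(best_p))
```

The variance falls to zero at p equal to 0 or 1, where the outcome is certain, which matches the reading of one half as the point of maximum uncertainty.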
The binomial distribution
A binomial distribution counts the number of successes in n independent Bernoulli trials that share the same p. Its mean is n times p and its variance is n times p times one minus p. Two conditions must hold for the count to be binomial:
- The trials must be independent.
- The success probability must stay constant across trials.
For example, the number of heads in ten fair coin flips is binomial with n equal to ten and p equal to one half.
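The coin-flip example can be worked through directly. This sketch computes the binomial probabilities from the standard formula (n choose k) times p to the k times one minus p to the n minus k, then recovers the mean and variance from them. The helper name binomial_pmf is an assumption chosen for illustration.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.5  # ten fair coin flips

# Probability of each possible number of heads, from 0 to n.
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Mean and variance computed from the probabilities themselves.
mean = sum(k * q for k, q in enumerate(pmf))
variance = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))

print(mean, n * p)                 # both should equal n times p
print(variance, n * p * (1 - p))   # both should equal n times p times (1 - p)
```

Computing the moments from the probabilities, rather than plugging into the formulas, confirms that n times p and n times p times one minus p describe this distribution.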
Why it matters
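These distributions underpin binary classification. A model that outputs a probability of the positive class is effectively a Bernoulli model per example. When the examples in a batch are independent and share the same positive probability p, the number of positives in the batch is binomial, so you expect n times p positives in a batch of size n.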
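A small simulation can illustrate the batch-level view. The sketch below assumes a hypothetical batch of 32 independent examples that all share a positive-class probability of 0.25; both numbers, the seed, and the helper name count_positives are made up for illustration. It draws each label as a Bernoulli trial and compares the average number of positives per batch with n times p.

```python
import random


def count_positives(batch_size: int, p: float, rng: random.Random) -> int:
    """Draw batch_size independent Bernoulli(p) labels and count the positives."""
    return sum(1 for _ in range(batch_size) if rng.random() < p)


rng = random.Random(0)       # fixed seed so the run is repeatable
batch_size, p = 32, 0.25     # illustrative values, not taken from the text
num_batches = 20_000

counts = [count_positives(batch_size, p, rng) for _ in range(num_batches)]
average_count = sum(counts) / num_batches
expected_count = batch_size * p  # binomial mean: n times p

print(average_count, expected_count)
```

If the per-example probabilities differ, the count is no longer binomial, although its expected value is still the sum of the individual probabilities.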
Key idea
The Bernoulli models one binary trial with success probability p. The binomial counts successes across n independent trials that share the same p, and that count has mean n times p.