The Bernoulli and Binomial Distributions

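Many events have just two outcomes: success or failure, click or no click, heads or tails. Two related distributions model such events: the Bernoulli, for a single trial, and the binomial, for the number of successes across repeated trials.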

The Bernoulli distribution

A Bernoulli distribution describes a single trial that succeeds with probability p and fails with probability one minus p. Its mean is p and its variance is p times one minus p. The variance is largest when p equals one half, the point of maximum uncertainty, and falls to zero as p approaches zero or one.
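The mean and variance formulas above can be checked with a few lines of Python. This is a minimal sketch: the function names and the grid of p values are illustrative choices, not part of the original text.

```python
def bernoulli_mean(p: float) -> float:
    """Mean of a single trial with success probability p."""
    return p


def bernoulli_variance(p: float) -> float:
    """Variance of a single trial: p times (1 - p)."""
    return p * (1.0 - p)


# Scan p = 0.00, 0.01, ..., 1.00 and find where the variance peaks.
grid = [i / 100 for i in range(101)]
best_p = max(grid, key=bernoulli_variance)

print("variance is largest at p =", best_p)
print("variance there =", bernoulli_variance(best_p))
```

On this grid the variance peaks at p equal to one half, matching the statement that a fair trial is the most uncertain one.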

The binomial distribution

A binomial distribution counts the number of successes in n independent Bernoulli trials that share the same p. Its mean is n times p and its variance is n times p times one minus p.

The trials must be independent.
The success probability must stay constant across trials.

For example, the number of heads in ten fair coin flips is binomial with n equal to ten and p equal to one half.
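The coin-flip example can be worked through numerically. The sketch below uses the standard binomial probability formula, which the text does not state explicitly: the chance of exactly k successes is the number of ways to choose k of the n trials, times p to the power k, times one minus p to the power n minus k. The function name binomial_pmf is an illustrative choice.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials,
    each succeeding with the same probability p."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


# Ten fair coin flips: n = 10, p = 0.5.
n, p = 10, 0.5
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Recover the mean and variance directly from the probabilities.
mean = sum(k * q for k, q in enumerate(pmf))
variance = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))

print("mean from pmf:", mean, "vs n * p:", n * p)
print("variance from pmf:", variance, "vs n * p * (1 - p):", n * p * (1 - p))
```

The mean and variance computed from the individual probabilities should agree with n times p and n times p times one minus p, up to floating-point rounding.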

Why it matters

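These distributions underpin binary classification. A model that outputs a probability of the positive class is effectively a Bernoulli model per example. When the examples in a batch are independent and share the same positive-class probability p, the number of positives in a batch of n examples is binomial, so you expect n times p of them.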
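A small simulation can illustrate the batch claim. This is a sketch under stated assumptions: every example in the batch is independent and shares the same positive-class probability, and the batch size, probability, number of batches, and seed below are made-up values chosen only for illustration.

```python
import random


def count_positives(n: int, p: float, rng: random.Random) -> int:
    """Draw n independent Bernoulli(p) labels and count the positives."""
    return sum(1 for _ in range(n) if rng.random() < p)


# Hypothetical settings: 2000 batches of 100 examples, each positive
# with probability 0.3. The seed is fixed so the run is repeatable.
rng = random.Random(0)
batch_size, p_positive, num_batches = 100, 0.3, 2000

counts = [count_positives(batch_size, p_positive, rng) for _ in range(num_batches)]
average_count = sum(counts) / num_batches
expected_count = batch_size * p_positive  # binomial mean: n times p

print("average positives per batch:", average_count)
print("binomial mean n * p:", expected_count)
```

The average count across batches should land close to n times p. If the per-example probabilities differ, the count is no longer exactly binomial, although its expected value is still the sum of the individual probabilities.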

Key idea

The Bernoulli models one binary trial with success probability p, and the binomial counts successes across n independent identical trials with mean n times p.
