Machine Learning

The Beta Binomial Conjugate Prior

A tidy prior that updates with simple counting.

When estimating a probability such as a coin's bias, the beta distribution is a natural prior. Paired with binomial data it forms a conjugate pair, meaning the posterior is again a beta distribution.

What conjugacy buys

Conjugacy means the posterior has the same form as the prior, just with updated parameters. Because that form is known in advance, there is no need to compute the normalizing integral over every possible value of the probability, so updating becomes plain arithmetic.
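As a rough numerical check of this claim, the sketch below computes a posterior the slow way, by multiplying prior and likelihood on a grid and normalizing, and compares it with the beta distribution that conjugacy predicts. The helper names `beta_pdf` and `grid_posterior`, the grid size, and the example counts are illustrative assumptions, not part of the lesson.

```python
import math

def beta_pdf(p, a, b):
    """Density of Beta(a, b) at p, computed via log-gamma for stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(p) + (b - 1) * math.log(1 - p))

def grid_posterior(a, b, successes, failures, n=2000):
    """Brute-force posterior: prior times likelihood, normalized numerically."""
    step = 1.0 / n
    grid = [(i + 0.5) * step for i in range(n)]  # midpoints avoid p = 0 and p = 1
    unnorm = [beta_pdf(p, a, b) * p**successes * (1 - p)**failures for p in grid]
    total = sum(unnorm) * step  # midpoint-rule estimate of the normalizing integral
    return grid, [u / total for u in unnorm]

# Assumed example: Beta(2, 2) prior, 7 successes, 3 failures.
grid, numeric = grid_posterior(2, 2, 7, 3)
closed_form = [beta_pdf(p, 2 + 7, 2 + 3) for p in grid]

max_gap = max(abs(x - y) for x, y in zip(numeric, closed_form))
print(f"largest gap between the two curves: {max_gap:.2e}")
```

The two curves should agree up to numerical integration error, which is the practical meaning of conjugacy here: the closed form replaces the integral.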

The update rule

A beta prior has two shape parameters, usually called alpha and beta, which act like prior counts of successes and failures.

  • Observe the data as a number of successes and failures.
  • Add the successes to alpha and the failures to beta.
  • The result is the posterior beta distribution.

For example, if a prior with alpha = 2 and beta = 2 sees seven successes and three failures, the posterior has alpha = 2 + 7 = 9 and beta = 2 + 3 = 5.
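The update rule above can be sketched in a few lines of Python. The function name `update_beta` is a hypothetical choice for illustration; the numbers are the ones from the example in the text.

```python
def update_beta(alpha, beta, successes, failures):
    """Return the posterior (alpha, beta) after observing the given counts."""
    return alpha + successes, beta + failures

# Prior of alpha 2 and beta 2, then seven successes and three failures.
posterior = update_beta(2, 2, successes=7, failures=3)
print(posterior)  # (9, 5)
```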

Pseudo counts as a prior

The prior parameters act as pseudo counts, imaginary observations seen before the real data. A prior of one and one is uniform over the interval from 0 to 1, expressing no preference. Larger values express a stronger prior belief, which pulls estimates toward the prior mean, alpha / (alpha + beta), and takes more real data to override.
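To make the pull of the prior concrete, here is a small sketch that compares posterior means under priors of different strength. It uses the standard mean of a beta distribution, alpha / (alpha + beta); the function names and the three example priors are assumptions chosen for illustration.

```python
def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

def posterior_mean(alpha, beta, successes, failures):
    """Mean of the posterior after adding observed counts to the prior."""
    return beta_mean(alpha + successes, beta + failures)

successes, failures = 7, 3  # observed success rate is 0.7

# All three priors have mean 0.5 but carry different numbers of pseudo counts.
for alpha, beta in [(1, 1), (2, 2), (20, 20)]:
    mean = posterior_mean(alpha, beta, successes, failures)
    print(f"prior ({alpha}, {beta}) -> posterior mean {mean:.3f}")
```

With the same ten observations, the weak priors leave the estimate close to the observed rate of 0.7, while the prior of twenty and twenty keeps it much nearer to 0.5.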

Why it matters

The beta binomial pair makes Bayesian updating intuitive and cheap, which is why it appears in A/B testing, click rate estimation, and many online learning systems.
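One reason the pair suits online settings is that updating one observation at a time gives the same posterior as updating once with the totals. The sketch below illustrates this with a made-up stream of click outcomes; the function name `update_online` and the data are hypothetical, not taken from any real system.

```python
def update_online(alpha, beta, outcomes):
    """Update the beta parameters one observation at a time."""
    for clicked in outcomes:
        if clicked:
            alpha += 1  # a success adds one to alpha
        else:
            beta += 1   # a failure adds one to beta
    return alpha, beta

# Hypothetical click stream: 1 = click, 0 = no click (7 clicks, 3 non-clicks).
stream = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]

sequential = update_online(2, 2, stream)
batch = (2 + sum(stream), 2 + len(stream) - sum(stream))

print(sequential, batch)  # (9, 5) (9, 5)
```

Only two numbers need to be stored between observations, which keeps the running estimate cheap to maintain.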

Key idea

A beta prior is conjugate to binomial data, so updating just adds observed successes and failures to its parameters.

Check yourself

1. What does conjugacy mean for the beta binomial pair?

2. How do you update a beta prior after observing data?

3. What do the prior parameters alpha and beta act like?