Machine Learning

The Bias-Variance Decomposition

Split expected error into bias, variance, and irreducible noise.

6 min read · advanced

The decomposition

For squared error, a model's expected test error splits cleanly into three parts.

  • Bias squared measures error from wrong assumptions: how far the prediction, averaged over training sets, sits from the truth.
  • Variance measures how much predictions wobble across different training sets.
  • Irreducible noise is the randomness in the data that no model can remove.
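
Written out, the three parts can be sketched as one equation. This assumes a setup the lesson does not spell out: targets are generated as y = f(x) + ε, where ε is zero-mean noise with variance σ² that is independent of the training set D, and \hat{f}_D is the model fitted on D. The symbols f, \hat{f}_D, D, and σ² are notation introduced here for illustration.

```latex
\mathbb{E}_{D,\varepsilon}\!\left[\bigl(y - \hat{f}_D(x)\bigr)^2\right]
= \underbrace{\bigl(\mathbb{E}_D[\hat{f}_D(x)] - f(x)\bigr)^2}_{\text{bias}^2}
+ \underbrace{\mathbb{E}_D\!\left[\bigl(\hat{f}_D(x) - \mathbb{E}_D[\hat{f}_D(x)]\bigr)^2\right]}_{\text{variance}}
+ \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Under these assumptions the cross terms vanish, because the noise has zero mean and is independent of D, and the deviation of \hat{f}_D(x) from its own average has zero mean.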

Reading the parts

  • A model that is too simple has high bias and underfits.
  • A model that is too flexible has high variance and overfits.
  • The noise floor sets the best error any model could reach.
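
One way to see these readings is to estimate the first two terms by simulation. The sketch below is illustrative only: the sine target, the noise level, the polynomial degrees, and the helper name `bias_variance` are assumptions made for this example, not part of the lesson. It refits a polynomial on many freshly sampled training sets and measures how the predictions behave on a fixed test grid.

```python
import numpy as np

def true_fn(x):
    # Assumed ground-truth function for the simulation.
    return np.sin(2 * np.pi * x)

def bias_variance(degree, n_train=40, n_sets=300, noise_sd=0.3, seed=0):
    """Monte Carlo estimate of (bias squared, variance) for a polynomial fit."""
    rng = np.random.default_rng(seed)
    x_test = np.linspace(0.1, 0.9, 50)
    preds = np.empty((n_sets, x_test.size))
    for i in range(n_sets):
        # Draw a fresh training set each round.
        x = rng.uniform(0.0, 1.0, n_train)
        y = true_fn(x) + rng.normal(0.0, noise_sd, n_train)
        coeffs = np.polyfit(x, y, degree)
        preds[i] = np.polyval(coeffs, x_test)
    mean_pred = preds.mean(axis=0)
    # Bias squared: gap between the average prediction and the truth.
    bias_sq = np.mean((mean_pred - true_fn(x_test)) ** 2)
    # Variance: spread of predictions across training sets.
    variance = np.mean(preds.var(axis=0))
    return bias_sq, variance

for degree in (1, 9):
    b, v = bias_variance(degree)
    print(f"degree={degree}  bias^2={b:.4f}  variance={v:.4f}")
```

In this setup the straight line (degree 1) should show the larger bias squared, and the degree 9 polynomial the larger variance. The noise term is simply `noise_sd ** 2` and does not depend on the model.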

The tradeoff

Bias and variance usually pull against each other. Adding capacity cuts bias but raises variance. Adding regularization cuts variance but raises bias. The sweet spot minimizes their sum.
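
The regularization side of the tradeoff can be worked out exactly for ridge regression on a fixed set of inputs. The sketch below is a minimal illustration under assumptions of its own: polynomial features, a sine target, a penalty that also applies to the intercept, and a hypothetical helper `ridge_bias_variance`. It uses the fact that ridge predictions are a linear map H of the targets, so the average prediction is H f and the prediction variance is σ² times the diagonal of H Hᵀ.

```python
import numpy as np

def ridge_bias_variance(lam, degree=9, n=50, noise_sd=0.3):
    """Exact (bias squared, variance) of ridge predictions at the training inputs."""
    x = np.linspace(-1.0, 1.0, n)
    f = np.sin(np.pi * x)  # assumed noiseless target values
    X = np.vander(x, degree + 1, increasing=True)  # polynomial features
    U, d, _ = np.linalg.svd(X, full_matrices=False)
    # Ridge shrinks each singular direction by d^2 / (d^2 + lam).
    shrink = d**2 / (d**2 + lam)
    H = (U * shrink) @ U.T  # maps targets to fitted values
    bias_sq = np.mean((H @ f - f) ** 2)
    # trace(H H^T) equals the sum of squared shrink factors.
    variance = noise_sd**2 * np.sum(shrink**2) / n
    return bias_sq, variance

lams = [1e-3, 1e-1, 1.0, 10.0]
results = [ridge_bias_variance(lam) for lam in lams]
for lam, (b, v) in zip(lams, results):
    print(f"lambda={lam:g}  bias^2={b:.5f}  variance={v:.5f}  sum={b + v:.5f}")

# The sweet spot on this grid is the penalty with the smallest sum.
best = min(zip(lams, results), key=lambda item: sum(item[1]))[0]
print("best lambda on this grid:", best)
```

As the penalty grows, the variance column falls and the bias squared column rises. Which penalty wins depends on the noise level and the grid chosen here, so the selected value is specific to this toy setup.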

Levers in practice

  • Regularization and simpler models lower variance.
  • More features or capacity lower bias.
  • More data mainly lowers variance, letting you afford more capacity.
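
The effect of more data can be checked with a small simulation. This is a sketch under assumed conditions (sine target, Gaussian noise, a fixed degree 5 polynomial, and a hypothetical helper `prediction_variance`), not a general result: it keeps the model fixed and only changes the training set size.

```python
import numpy as np

def prediction_variance(n_train, degree=5, n_sets=200, noise_sd=0.3, seed=0):
    """Average variance of a polynomial fit's predictions across training sets."""
    rng = np.random.default_rng(seed)
    x_test = np.linspace(0.1, 0.9, 40)
    preds = np.empty((n_sets, x_test.size))
    for i in range(n_sets):
        # Each round uses a new training set of the requested size.
        x = rng.uniform(0.0, 1.0, n_train)
        y = np.sin(2 * np.pi * x) + rng.normal(0.0, noise_sd, n_train)
        preds[i] = np.polyval(np.polyfit(x, y, degree), x_test)
    return preds.var(axis=0).mean()

for n in (30, 300, 3000):
    print(f"n_train={n}  variance={prediction_variance(n):.5f}")
```

With the model held fixed, the variance estimate should shrink as the training set grows, which is the headroom that lets a more flexible model be used without overfitting.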

Key idea

Expected error decomposes into bias squared, variance, and irreducible noise, and good models minimize the sum of the first two.

Check yourself

1. What does the variance term capture?

2. What does the irreducible noise represent?

3. What typically happens when you add model capacity?