Machine Learning

Naive Bayes Assumptions

Why pretending features are independent still works.

Naive Bayes is a classifier built on Bayes' rule plus one bold shortcut: it assumes every feature is conditionally independent of the others given the class.

The independence assumption

Given the class label, naive Bayes treats each feature as contributing evidence on its own. This lets it multiply per-feature probabilities together with the class prior instead of modeling how features interact. The assumption is the "naive" part, because real features often do depend on each other.
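The multiplication described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the feature names (`has_link`, `all_caps`), the class labels, and all probability values are made up for the example. It sums logarithms rather than multiplying raw probabilities, a common way to avoid numerical underflow when there are many features.

```python
import math

def naive_bayes_scores(prior, likelihoods, features):
    """Return log P(y) + sum_i log P(x_i | y) for every class y."""
    scores = {}
    for label, p_label in prior.items():
        log_score = math.log(p_label)
        # Each feature contributes independently: no interaction terms.
        for name, value in features.items():
            log_score += math.log(likelihoods[label][name][value])
        scores[label] = log_score
    return scores

# Hypothetical numbers, chosen only for illustration.
prior = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"has_link": {True: 0.7, False: 0.3},
             "all_caps": {True: 0.5, False: 0.5}},
    "ham": {"has_link": {True: 0.2, False: 0.8},
            "all_caps": {True: 0.1, False: 0.9}},
}

scores = naive_bayes_scores(prior, likelihoods,
                            {"has_link": True, "all_caps": False})
predicted = max(scores, key=scores.get)
print(predicted)
```

With these numbers the unnormalized scores are 0.4 × 0.7 × 0.5 = 0.14 for spam and 0.6 × 0.2 × 0.9 = 0.108 for ham, so the sketch picks spam.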

Why it still works

Even when the independence assumption is false, the classifier often predicts well. Choosing the right class only requires that the correct class receive the highest score; the probabilities themselves do not have to be accurate.
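A small exact calculation can make this concrete. The sketch below assumes a toy problem with two binary features that are deliberately dependent within each class; the joint tables and priors are invented for illustration. It compares the true posterior (computed from the full joint) with the naive Bayes posterior (computed from per-feature marginals).

```python
from itertools import product

prior = {0: 0.4, 1: 0.6}
# joint[y][(x1, x2)] = P(x1, x2 | y). Hypothetical values; within each
# class the two features are positively correlated, so independence fails.
joint = {
    0: {(0, 0): 0.6, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.2},
    1: {(0, 0): 0.2, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.6},
}

def marginal(y, index, value):
    """P(x_index = value | y), obtained by summing the joint table."""
    return sum(p for x, p in joint[y].items() if x[index] == value)

def normalize(scores):
    total = sum(scores.values())
    return {y: s / total for y, s in scores.items()}

def true_posterior(x):
    return normalize({y: prior[y] * joint[y][x] for y in prior})

def naive_posterior(x):
    return normalize({
        y: prior[y] * marginal(y, 0, x[0]) * marginal(y, 1, x[1])
        for y in prior
    })

results = {}
for x in product([0, 1], repeat=2):
    t, n = true_posterior(x), naive_posterior(x)
    results[x] = (t[1], n[1], max(t, key=t.get) == max(n, key=n.get))
    print(x, f"true P(y=1)={t[1]:.3f}", f"naive P(y=1)={n[1]:.3f}")
```

In this particular toy setup the naive probabilities differ from the true ones on some inputs, yet the predicted class matches on all four. Other tables can produce disagreements, which is why the text says "often" rather than "always".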

  • It needs relatively little training data because each feature's distribution is estimated separately, not as part of a joint distribution over all features.
  • Training is fast since there are no feature interactions to learn.
  • It handles many features gracefully, making it popular for text.
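To see why separate estimation keeps data requirements low, it helps to count parameters. The sketch below assumes binary features and ignores the class priors; the function names are illustrative. A full joint table over d binary features needs 2^d − 1 free probabilities per class, while naive Bayes needs only one per feature per class.

```python
def full_joint_parameters(num_features, num_classes):
    """Free parameters for an unrestricted joint over binary features."""
    return num_classes * (2 ** num_features - 1)

def naive_bayes_parameters(num_features, num_classes):
    """Free parameters when each binary feature is modeled on its own."""
    return num_classes * num_features

full_20 = full_joint_parameters(20, 2)
naive_20 = naive_bayes_parameters(20, 2)
print(full_20, naive_20)
```

For 20 binary features and 2 classes this is 2,097,150 parameters versus 40, which is why the naive model remains practical for text, where the feature count is the vocabulary size.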

When it breaks

Strongly correlated features get their evidence counted multiple times, which can make the model overconfident. Despite this, the predicted class is frequently still correct, so naive Bayes remains a strong simple baseline.
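The double counting is easiest to see in the extreme case where one feature is simply copied. The following sketch assumes two equally likely classes and a single piece of evidence with made-up likelihoods; `copies` is a hypothetical parameter standing in for how many perfectly correlated duplicates of that feature the model sees.

```python
def posterior(prior, likelihood_per_copy, copies):
    """Naive Bayes posterior when the same evidence appears `copies` times."""
    scores = {
        label: prior[label] * likelihood_per_copy[label] ** copies
        for label in prior
    }
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}

# Hypothetical values for illustration.
prior = {"spam": 0.5, "ham": 0.5}
likelihood = {"spam": 0.8, "ham": 0.4}

once = posterior(prior, likelihood, copies=1)
thrice = posterior(prior, likelihood, copies=3)
print(round(once["spam"], 3), round(thrice["spam"], 3))
```

With one copy the posterior for spam is 2/3. With three copies it rises to 8/9, even though no new information was added. The predicted class is the same in both cases; only the confidence is inflated.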

Key idea

Naive Bayes assumes features are independent given the class, a wrong but useful shortcut that still classifies well.

Check yourself

1. What independence does naive Bayes assume?

2. Why does naive Bayes classify well despite the wrong assumption?