Naive Bayes

A fast probabilistic classifier that assumes features are conditionally independent.

What it is

Naive Bayes is a probabilistic classifier built on Bayes' theorem. It estimates the probability of each class given the observed features, then picks the most probable class.

The naive assumption

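The key shortcut is assuming that features are conditionally independent given the class. This rarely holds for real data, yet the classifier often works well anyway: picking the right class only requires the correct class to score highest, not that the probability estimates themselves be accurate.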

Independence lets us multiply per-feature probabilities instead of modeling the joint distribution of all features.
This makes training a matter of counting, which is extremely fast.
The number of parameters grows only linearly with the number of features, which is why it shines on high-dimensional data such as text.
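
The factorization that the independence assumption buys can be written out explicitly. The symbols here are assumptions introduced for illustration: y is a class label, x_1, ..., x_n are the feature values, and the hat marks the predicted class.

```latex
P(y \mid x_1, \dots, x_n) \;\propto\; P(y) \prod_{i=1}^{n} P(x_i \mid y),
\qquad
\hat{y} = \arg\max_{y} \; P(y) \prod_{i=1}^{n} P(x_i \mid y)
```

The denominator P(x_1, ..., x_n) from Bayes' theorem is dropped because it is the same for every class and so does not change which class wins.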

Training and prediction

Estimate the prior for each class from label frequencies.
Estimate the likelihood of each feature value within each class.
Combine them with Bayes' rule and choose the class with the highest score.
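
The three steps above can be sketched for categorical features in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `train` and `predict` and the toy weather data are hypothetical, and the sketch deliberately uses raw counts and a plain product so that each step stays visible.

```python
from collections import Counter, defaultdict

def train(rows, labels):
    """Estimate class priors and per-feature likelihoods by counting."""
    n = len(labels)
    class_counts = Counter(labels)
    # Step 1: prior of each class = its label frequency.
    priors = {c: k / n for c, k in class_counts.items()}
    # Step 2: count each feature value within each class.
    # value_counts[(class, feature_index)][value] = occurrences
    value_counts = defaultdict(Counter)
    for row, c in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[(c, i)][v] += 1
    likelihoods = {
        key: {v: k / class_counts[key[0]] for v, k in counts.items()}
        for key, counts in value_counts.items()
    }
    return priors, likelihoods

def predict(priors, likelihoods, row):
    """Step 3: score each class as prior * product of likelihoods."""
    scores = {}
    for c, prior in priors.items():
        score = prior
        for i, v in enumerate(row):
            # A value never seen with this class gets probability 0
            # in this unsmoothed sketch.
            score *= likelihoods.get((c, i), {}).get(v, 0.0)
        scores[c] = score
    return max(scores, key=scores.get), scores

# Hypothetical toy data: (outlook, temperature) -> play?
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes"]

priors, likelihoods = train(rows, labels)
label, scores = predict(priors, likelihoods, ("rainy", "mild"))
print(label, scores)
```

The scores are unnormalized: they rank the classes correctly but do not sum to one unless divided by their total.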

Practical notes

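Use Laplace (add-one) smoothing so that a feature value never seen with a class does not drive that class's score to zero.
Work in log space, summing log probabilities instead of multiplying many tiny probabilities, to avoid floating-point underflow.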
It gives a strong, cheap baseline for spam and topic classification.
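
The first two notes can be illustrated with a short sketch. The helper name `smoothed_log_prob`, its parameters, and the counts used below are hypothetical values chosen for illustration; `alpha=1.0` corresponds to Laplace smoothing.

```python
import math

def smoothed_log_prob(count, class_total, n_values, alpha=1.0):
    """log P(value | class) with add-alpha smoothing (Laplace when alpha=1)."""
    return math.log((count + alpha) / (class_total + alpha * n_values))

# Hypothetical counts: a class with 10 observed tokens and a 5-word vocabulary.
unseen = smoothed_log_prob(0, 10, 5)  # never seen, yet probability stays above 0
seen = smoothed_log_prob(4, 10, 5)    # seen 4 times

# In log space the class score is a sum: log prior + log likelihoods.
log_score = math.log(0.5) + unseen + seen

# Why log space matters: multiplying 100 probabilities of 1e-5 underflows
# to 0.0 in double precision, while the sum of their logs stays finite.
product = 1.0
log_sum = 0.0
for _ in range(100):
    product *= 1e-5
    log_sum += math.log(1e-5)

print(product, log_sum)
```

Because the logarithm is monotonic, the class with the highest log score is also the class with the highest probability, so the prediction is unchanged.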

Key idea

Naive Bayes assumes features are conditionally independent given the class, turning classification into fast counting that works surprisingly well on text.
