What it is
Naive Bayes is a probabilistic classifier built on Bayes' theorem. It estimates the probability of each class given the features, then picks the most likely class.
The naive assumption
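The key shortcut is assuming features are conditionally independent given the class. This rarely holds in real data, yet the classifier often works well anyway, because picking the right class only requires the correct class to score highest, not accurate probabilities.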
- Independence lets us multiply per-feature probabilities instead of modeling joint behavior.
- This makes training a matter of counting, which is extremely fast.
- It handles many features gracefully, which is why it shines on text.
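Written out, the independence assumption turns the posterior into a prior times a product of per-feature terms. The symbols below are introduced here for illustration: c is a class, x_1 through x_n are the feature values, and c-hat is the predicted class.

```latex
P(c \mid x_1, \dots, x_n) \;\propto\; P(c) \prod_{i=1}^{n} P(x_i \mid c),
\qquad
\hat{c} = \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(x_i \mid c)
```

The shared denominator P(x_1, ..., x_n) is the same for every class, so it can be dropped when only the highest-scoring class is needed.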
Training and prediction
- Estimate the prior for each class from label frequencies.
- Estimate the likelihood of each feature value within each class.
- Combine them with Bayes' rule and choose the highest score.
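The three steps above can be sketched for categorical features as follows. This is a minimal illustration, not a reference implementation: the function names `train` and `predict`, the toy rows, and the two made-up features (whether a message contains a link, whether it has a greeting) are all assumptions for the example.

```python
from collections import Counter, defaultdict

def train(rows, labels):
    """Count-based estimates of priors P(c) and likelihoods P(x_i = v | c)."""
    n = len(labels)
    class_counts = Counter(labels)
    # Step 1: prior of each class from label frequencies.
    priors = {c: k / n for c, k in class_counts.items()}
    # value_counts[(c, i)][v] = number of class-c rows whose feature i equals v
    value_counts = defaultdict(Counter)
    for row, c in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[(c, i)][v] += 1
    # Step 2: likelihood of each feature value within each class.
    likelihoods = {
        key: {v: k / class_counts[key[0]] for v, k in counts.items()}
        for key, counts in value_counts.items()
    }
    return priors, likelihoods

def predict(priors, likelihoods, row):
    """Step 3: score each class as prior times per-feature likelihoods."""
    scores = {}
    for c, prior in priors.items():
        score = prior
        for i, v in enumerate(row):
            # A value never seen with class c gets probability 0 in this
            # unsmoothed sketch.
            score *= likelihoods.get((c, i), {}).get(v, 0.0)
        scores[c] = score
    return max(scores, key=scores.get), scores

# Toy data (made up): each row is (contains_link, has_greeting).
rows = [("yes", "no"), ("yes", "no"), ("no", "yes"), ("no", "yes"), ("yes", "yes")]
labels = ["spam", "spam", "ham", "ham", "ham"]

priors, likelihoods = train(rows, labels)
label, scores = predict(priors, likelihoods, ("yes", "no"))
print(label, scores)
```

The scores are unnormalized: dividing each by their sum would give posterior probabilities, but the ranking of classes is the same either way.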
Practical notes
- Use Laplace smoothing so an unseen feature value does not zero out a whole class.
- Work in log space: summing log probabilities avoids the floating-point underflow that comes from multiplying many tiny probabilities.
- It gives a strong, cheap baseline for spam and topic classification.
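The first two notes can be illustrated with a short sketch. The helper names `smoothed_likelihood` and `log_score`, the smoothing strength `alpha`, and all the numbers are assumptions chosen for the example.

```python
import math

def smoothed_likelihood(count, class_total, n_values, alpha=1.0):
    """Laplace (add-alpha) estimate of P(value | class).

    count:       times the value appeared with this class
    class_total: number of training examples in this class
    n_values:    number of distinct values the feature can take
    """
    return (count + alpha) / (class_total + alpha * n_values)

def log_score(prior, likelihoods):
    """Class score as a sum of logs instead of a product of probabilities."""
    return math.log(prior) + sum(math.log(p) for p in likelihoods)

# A value never seen with this class still gets a small nonzero probability,
# so it cannot zero out the whole class.
unseen = smoothed_likelihood(count=0, class_total=3, n_values=2)

# 500 features, each with likelihood 1e-3: the plain product underflows
# to 0.0 in double precision, while the log-space score stays finite.
tiny = [1e-3] * 500
product = math.prod(tiny)
log_total = log_score(0.5, tiny)

print(unseen, product, log_total)
```

Because the logarithm is monotonic, the class with the highest log score is also the class with the highest probability, so the prediction does not change.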
Key idea
Naive Bayes assumes features are conditionally independent given the class, turning classification into fast counting that works surprisingly well on text.