Gaussian Naive Bayes
Plain naive Bayes counts categorical events, but many features are continuous. Gaussian naive Bayes handles numeric features by modeling each one with a normal distribution per class.
What it estimates
For every combination of feature and class, the model computes two numbers from the training data.
- The mean of that feature among samples of that class.
- The variance of that feature within that class.
These define a bell curve that gives the likelihood of any feature value under each class.
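The estimation step can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the names `fit_gaussian_nb` and `gaussian_pdf` and the toy data are hypothetical, and dividing by n (rather than n - 1) when computing the variance is an assumption of the sketch.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Return {class: (means, variances)} with one entry per feature."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)

    params = {}
    for label, rows in rows_by_class.items():
        n = len(rows)
        columns = list(zip(*rows))  # one tuple of values per feature
        means = [sum(col) / n for col in columns]
        # Assumption: maximum-likelihood variance, dividing by n.
        variances = [
            sum((v - m) ** 2 for v in col) / n
            for col, m in zip(columns, means)
        ]
        params[label] = (means, variances)
    return params

def gaussian_pdf(x, mean, var):
    """Height of the bell curve with the given mean and variance at x."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

# Toy data: two numeric features, two classes.
X = [[1.0, 20.0], [2.0, 22.0], [3.0, 24.0], [6.0, 30.0], [8.0, 34.0]]
y = ["a", "a", "a", "b", "b"]
params = fit_gaussian_nb(X, y)
```

Here `params["a"]` holds the two means and two variances for class "a", and `gaussian_pdf(x, mean, var)` turns any pair of them into a likelihood for a feature value `x`.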
Making a prediction
To classify a new point, the model multiplies the class prior by the bell curve likelihood of each feature value, still assuming the features are independent given the class. The class with the largest product wins. In practice, implementations add log probabilities instead of multiplying probabilities, which keeps a product of many small numbers from underflowing to zero.
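A rough sketch of the prediction rule in log space follows. The function names and the fitted statistics in `params` are hypothetical values chosen for illustration, and each class entry is assumed to hold a prior, a list of means, and a list of variances.

```python
import math

def log_gaussian(x, mean, var):
    """Log of the normal density, computed directly so it cannot underflow."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def predict(point, params):
    """Return (best_class, scores); params maps class -> (prior, means, variances)."""
    scores = {}
    for label, (prior, means, variances) in params.items():
        score = math.log(prior)  # log prior
        for x, m, v in zip(point, means, variances):
            score += log_gaussian(x, m, v)  # adding logs replaces multiplying likelihoods
        scores[label] = score
    return max(scores, key=scores.get), scores

# Hypothetical fitted statistics for two classes and two features.
params = {
    "a": (0.6, [2.0, 22.0], [1.0, 4.0]),
    "b": (0.4, [7.0, 32.0], [1.0, 4.0]),
}
label, scores = predict([2.5, 23.0], params)
```

The point `[2.5, 23.0]` sits close to the class "a" means, so "a" receives the larger log score. Because only the ranking of the scores matters, the shared normalizing constant of Bayes' rule is never computed.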
Strengths and limits
Gaussian naive Bayes is fast and needs only two statistics per feature per class, so it can be trained on small datasets. Its weakness is the shape assumption: a feature that is skewed or multimodal is poorly described by a single bell curve. Transforming such features toward normality can help.
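As one illustration of such a transform, the sketch below applies a logarithm to a right-skewed sample and compares a simple skewness measure before and after. The log transform is only one possible choice and assumes strictly positive values; the `skewness` helper and the sample data are hypothetical.

```python
import math

def skewness(values):
    """Third standardized moment: 0 for symmetric data, positive for a long right tail."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    return third / var ** 1.5

# Made-up right-skewed feature: most values are small, a few are very large.
raw = [1, 2, 2, 3, 4, 5, 8, 15, 40, 120]
logged = [math.log(v) for v in raw]  # requires every value to be > 0

raw_skew = skewness(raw)
log_skew = skewness(logged)
```

For this sample, `log_skew` is much closer to zero than `raw_skew`, so a single bell curve fits the transformed feature better. A transform like this does not fix multimodal features, which have more than one peak.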
Key idea
Gaussian naive Bayes models each numeric feature as a per-class bell curve defined by its mean and variance.