The imbalance problem
When one class dominates, a model can score high accuracy by always predicting the majority while ignoring the rare class entirely. Class weighting counters this by scaling the loss so mistakes on rare classes cost more.
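As a small illustration with made-up counts (95 majority and 5 rare examples, chosen only for this sketch), a classifier that always predicts the majority class looks strong on accuracy and finds none of the rare cases:

```python
# Hypothetical toy labels: 95 majority (0) and 5 rare (1) examples.
y_true = [0] * 95 + [1] * 5

# A "model" that always predicts the majority class.
y_pred = [0] * len(y_true)

correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)

# Recall on the rare class: fraction of true rare examples that were found.
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
rare_recall = true_positives / sum(t == 1 for t in y_true)

print(f"accuracy={accuracy:.2f}, rare-class recall={rare_recall:.2f}")
# accuracy=0.95, rare-class recall=0.00
```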
How it works
- Assign each class a weight, often inversely proportional to its frequency, so a rare class contributes more per example.
- Errors on rare examples then produce a larger share of the loss, so the optimizer works harder to get them right and the learned boundary shifts to assign more of the input space to the rare class.
- Unlike resampling, it touches the loss rather than the data, so no rows are duplicated or dropped.
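The steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names are hypothetical, the problem is assumed to be binary, and the weight formula `n_samples / (n_classes * class_count)` is one common inverse-frequency heuristic among several.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


def weighted_log_loss(labels, probs, weights):
    """Weighted mean binary cross-entropy; probs are P(label == 1)."""
    total = 0.0
    weight_sum = 0.0
    for y, p in zip(labels, probs):
        p_true = p if y == 1 else 1.0 - p  # probability given to the true class
        total += weights[y] * -math.log(p_true)
        weight_sum += weights[y]
    return total / weight_sum


# Toy data (assumed for illustration): nine majority examples, one rare example.
labels = [0] * 9 + [1]
weights = inverse_frequency_weights(labels)
# Rare class 1 gets weight 5.0; majority class 0 gets about 0.56.

# A model that predicts the base rate (0.1) for every example.
probs = [0.1] * len(labels)

unweighted = weighted_log_loss(labels, probs, {0: 1.0, 1: 1.0})
weighted = weighted_log_loss(labels, probs, weights)
print(unweighted, weighted)
```

The data is identical in both calls; only the per-class weights differ. The weighted loss is larger because the poorly predicted rare example now counts as much as all nine majority examples combined.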
Weighting versus resampling
- Resampling changes the data the model sees by oversampling minorities or undersampling majorities.
- Weighting keeps all the data and changes how much each example counts. It avoids the overfitting to duplicated rows that oversampling can cause, and it avoids the loss of information that comes from undersampling.
Cautions
- Weighting shifts the predicted probabilities toward the up-weighted classes, so they usually need calibration before being read as true likelihoods.
- The right metric matters. Tune weights against a metric that respects the rare class, such as recall or area under the precision-recall curve, not raw accuracy.
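For the first caution, one simple correction can be sketched under strong assumptions: a binary problem, and a model that closely minimizes the weighted log loss. In that idealized case the weighted model's odds are the true odds multiplied by `w_pos / w_neg`, so the factor can be divided back out. The function name and the weights below are illustrative only, and fitting a calibrator on held-out data is often the more reliable route in practice.

```python
def unweight_probability(q, w_pos, w_neg):
    """Map a probability q from a class-weighted model back to the original scale.

    Assumes the weighted model's odds equal the true odds times w_pos / w_neg.
    """
    return (w_neg * q) / (w_neg * q + w_pos * (1.0 - q))


# Illustrative weights: the rare class counts 9 times as much as the majority.
w_pos, w_neg = 9.0, 1.0

# A weighted model output of 0.5 corresponds to true odds of 1:9.
p = unweight_probability(0.5, w_pos, w_neg)
print(round(p, 3))  # 0.1
```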
Key idea
Class weighting scales the loss so that errors on rare classes cost more, addressing imbalance without resampling, but it distorts predicted probabilities and must be tuned against a metric that respects the rare class.