The problem
In many real tasks one class is rare. Fraud, disease, and defects might appear in well under one percent of examples. A model can score high accuracy by always predicting the majority class while completely failing on the cases that matter.
Why accuracy lies
With a ninety-nine to one split, predicting the majority class every time gives ninety-nine percent accuracy and zero recall on the minority class: not one rare case is detected. Look at precision, recall, and the F1 score for the minority class instead.
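The arithmetic can be checked with a small sketch. The helper name `minority_metrics` and the toy labels are illustrative assumptions, and precision is reported as zero by convention when the model never predicts the positive class.

```python
def minority_metrics(y_true, y_pred, positive=1):
    """Accuracy plus precision, recall, and F1 for the class labeled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # Convention assumed here: undefined ratios (0/0) are reported as 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy data: one rare positive among one hundred examples.
y_true = [1] + [0] * 99
always_majority = [0] * 100  # a "model" that never predicts the rare class

baseline = minority_metrics(y_true, always_majority)
print(baseline)  # accuracy is 0.99 while recall and F1 are 0.0
```

The accuracy figure alone hides the fact that the single rare case was missed.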
Data-level fixes
- Oversampling repeats minority examples or synthesizes new ones, as SMOTE does by interpolating between neighboring minority points
- Undersampling drops some majority examples
- Class-balanced sampling draws batches so classes appear more evenly
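A minimal sketch of the three ideas, using only the standard library. The function names and toy data are illustrative assumptions. The oversampler here simply repeats minority examples at random; it does not implement SMOTE's synthesis step. Resampling would normally be applied to the training split only, so that validation and test data keep the real class ratio.

```python
import random
from collections import Counter


def group_by_class(examples, labels):
    """Map each label to the list of examples that carry it."""
    by_class = {}
    for x, y in zip(examples, labels):
        by_class.setdefault(y, []).append(x)
    return by_class


def random_oversample(examples, labels, seed=0):
    """Repeat examples (with replacement) until every class matches the largest."""
    rng = random.Random(seed)
    by_class = group_by_class(examples, labels)
    target = max(len(xs) for xs in by_class.values())
    out = []
    for y, xs in by_class.items():
        out.extend((x, y) for x in xs)
        out.extend((rng.choice(xs), y) for _ in range(target - len(xs)))
    rng.shuffle(out)
    return out


def random_undersample(examples, labels, seed=0):
    """Keep a random subset of each class, sized to match the smallest class."""
    rng = random.Random(seed)
    by_class = group_by_class(examples, labels)
    target = min(len(xs) for xs in by_class.values())
    out = []
    for y, xs in by_class.items():
        out.extend((x, y) for x in rng.sample(xs, target))
    rng.shuffle(out)
    return out


def balanced_batch(examples, labels, batch_size, seed=0):
    """Draw a batch by picking a class uniformly, then an example from it."""
    rng = random.Random(seed)
    by_class = group_by_class(examples, labels)
    classes = sorted(by_class)
    batch = []
    for _ in range(batch_size):
        y = rng.choice(classes)
        batch.append((rng.choice(by_class[y]), y))
    return batch


# Toy training set: 5 minority examples (label 1) and 95 majority (label 0).
examples = list(range(100))
labels = [1] * 5 + [0] * 95

over = random_oversample(examples, labels)
under = random_undersample(examples, labels)
batch = balanced_batch(examples, labels, batch_size=32)

print(Counter(y for _, y in over))   # both classes at 95
print(Counter(y for _, y in under))  # both classes at 5
```

Oversampling keeps all the majority data at the cost of repeated minority points, while undersampling discards data to reach balance. Which trade-off is acceptable depends on the size of the dataset.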
Loss-level fixes
- Class weights make mistakes on the minority class cost more in the loss
- Focal loss down-weights easy, confidently classified examples so training focuses on hard ones, which under imbalance are often minority cases
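Both ideas can be sketched for a single binary prediction. The function names are illustrative assumptions. The defaults `gamma=2.0` and `alpha=0.25` are values commonly quoted for focal loss and are assumptions here, not recommendations. The predicted probability of the positive class is `p`, and `y` is the true label.

```python
import math


def weighted_bce(p, y, pos_weight=1.0, eps=1e-12):
    """Binary cross-entropy where errors on the positive class are scaled by pos_weight."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    if y == 1:
        return -pos_weight * math.log(p)
    return -math.log(1.0 - p)


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t)."""
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p          # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy positive (p_t = 0.9) and a hard positive (p_t = 0.1).
easy = focal_loss(0.9, 1)
hard = focal_loss(0.1, 1)
print(easy, hard)  # the hard example dominates the loss

# With gamma = 0 the modulating factor vanishes and only the alpha weighting remains.
plain = focal_loss(0.9, 1, gamma=0.0, alpha=0.5)
print(plain, 0.5 * weighted_bce(0.9, 1))
```

With `gamma=2.0`, the easy example's loss is scaled by (1 - 0.9)^2 = 0.01, while the hard example's is scaled by (1 - 0.1)^2 = 0.81, which is the down-weighting the second bullet describes.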
Threshold and evaluation
The default decision threshold of one half is rarely optimal under imbalance. Tuning the threshold on a validation set and reporting the precision-recall curve give a far truer picture than raw accuracy. Always evaluate with metrics that reflect the cost of missing the rare class.
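A minimal sketch of threshold tuning on a validation set. The scores and labels are made-up toy values, the function names are illustrative, and maximizing minority-class F1 is only one possible criterion. Where the costs of false alarms and misses are known, a cost-based criterion may be a better fit.

```python
def confusion_at(scores, y_true, threshold):
    """Count true positives, false positives, and false negatives at a threshold."""
    tp = fp = fn = 0
    for s, y in zip(scores, y_true):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    return tp, fp, fn


def f1_at(scores, y_true, threshold):
    tp, fp, fn = confusion_at(scores, y_true, threshold)
    return 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0


def pr_curve(scores, y_true):
    """Return (threshold, precision, recall) for each distinct score as a threshold."""
    points = []
    for t in sorted(set(scores)):
        tp, fp, fn = confusion_at(scores, y_true, t)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        points.append((t, precision, recall))
    return points


def best_threshold(scores, y_true):
    """Pick the candidate threshold with the highest F1 on the validation data."""
    return max(sorted(set(scores)), key=lambda t: f1_at(scores, y_true, t))


# Toy validation scores from a model that is under-confident on the rare class.
val_scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.45, 0.60]
val_labels = [0, 0, 0, 0, 1, 0, 1, 1]

default_f1 = f1_at(val_scores, val_labels, 0.5)
tuned = best_threshold(val_scores, val_labels)
tuned_f1 = f1_at(val_scores, val_labels, tuned)
print(default_f1, tuned, tuned_f1)
```

On this toy data the default threshold catches only one of the three positives, while the tuned threshold catches all three at the cost of one false alarm. The chosen threshold should then be confirmed on a separate test set, since it was fitted to the validation data.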
Key idea
With imbalance, accuracy misleads, so combine resampling or reweighting with threshold tuning and minority-focused metrics.