The curve
The ROC curve plots the true positive rate against the false positive rate as the threshold sweeps from strict to loose. Each threshold is one point on the curve.
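As a minimal sketch of that sweep, the snippet below recomputes both rates at each threshold on a six-example toy set. The function name roc_points and the data are made up for illustration, and it assumes a score at or above the threshold counts as a positive prediction.

```python
def roc_points(labels, scores):
    """Return (fpr, tpr) pairs, one per threshold, from strict to loose."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Start stricter than every score (nothing predicted positive), then loosen.
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)
    points = []
    for t in thresholds:
        # Assumption: score >= threshold means "predict positive".
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Toy data, invented for illustration: 1 = positive, 0 = negative.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

The first point is (0, 0), where nothing is flagged, and the last is (1, 1), where everything is flagged. Each loosening of the threshold moves the point up (a positive is captured) or right (a negative is let through).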
What the area means
AUC is the area under that curve, between 0 and 1.
- 0.5 is random guessing, the diagonal line
- 1.0 is perfect separation
- A clean probabilistic reading: AUC is the chance that a randomly chosen positive scores higher than a randomly chosen negative, with ties counted as half
This makes AUC threshold-independent. It judges ranking quality, not a specific operating point.
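The probabilistic reading can be checked directly by comparing every positive with every negative. The sketch below does that on toy data; pairwise_auc is an illustrative name, and counting a tie as half a win is the usual convention, assumed here. Cubing the scores changes their values but not their order, so the result should not move.

```python
def pairwise_auc(labels, scores):
    """Fraction of (positive, negative) pairs where the positive scores higher."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Assumption: a tie counts as half a win.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

auc = pairwise_auc(labels, scores)

# A monotone transform keeps the ranking, so AUC stays the same.
cubed = [s ** 3 for s in scores]
auc_cubed = pairwise_auc(labels, cubed)

print(auc, auc_cubed)
```

Here 8 of the 9 positive-negative pairs are ordered correctly, so both values come out to 8/9. The double loop is quadratic in the number of examples, which is fine for a sketch but not for large datasets.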
Strengths
- Invariant to the chosen threshold
- Invariant to class prior, so it does not shift when prevalence changes, as long as the score distribution within each class stays the same
The catch
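That same prior invariance is a weakness under heavy imbalance. The false positive rate divides by the number of negatives, which is huge when positives are rare, so even a large number of false positives barely moves it. Precision divides by the number of predicted positives instead, so those same false positives can swamp the few true positives. AUC can look strong while precision is dismal.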
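A small sketch of that effect, on invented data: the same scorer is evaluated on a balanced set and on a copy where every negative is repeated 100 times. The helper names and the 0.5 threshold are assumptions made for illustration.

```python
def pairwise_auc(labels, scores):
    """AUC as the fraction of correctly ordered (positive, negative) pairs."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def rates_at(labels, scores, threshold):
    """Return (false positive rate, precision) at one threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    n_neg = sum(1 for y in labels if y == 0)
    return fp / n_neg, tp / (tp + fp)


# Toy scores, invented for illustration: one negative outranks one positive.
pos_scores = [0.9, 0.8, 0.7, 0.6]
neg_scores = [0.65, 0.3, 0.2, 0.1]

# Balanced set: 4 positives, 4 negatives.
bal_labels = [1] * 4 + [0] * 4
bal_scores = pos_scores + neg_scores

# Imbalanced set: same positives, every negative repeated 100 times.
imb_neg = neg_scores * 100
imb_labels = [1] * 4 + [0] * len(imb_neg)
imb_scores = pos_scores + imb_neg

auc_bal = pairwise_auc(bal_labels, bal_scores)
auc_imb = pairwise_auc(imb_labels, imb_scores)
fpr_bal, prec_bal = rates_at(bal_labels, bal_scores, 0.5)
fpr_imb, prec_imb = rates_at(imb_labels, imb_scores, 0.5)

print(f"balanced:   AUC={auc_bal:.4f}  FPR={fpr_bal:.2f}  precision={prec_bal:.3f}")
print(f"imbalanced: AUC={auc_imb:.4f}  FPR={fpr_imb:.2f}  precision={prec_imb:.3f}")
```

AUC and the false positive rate are identical on both sets, because the negatives' score distribution did not change. Precision falls from 0.8 to about 0.04, because 100 false positives now compete with the same 4 true positives.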
Key idea
ROC AUC measures ranking quality independent of threshold and prevalence. Under severe imbalance, prefer PR AUC, because ROC AUC can flatter a model whose precision is poor.