ROC and AUC

The idea

Many classifiers output a score rather than a hard label. The ROC curve shows how that score separates classes as you sweep the decision threshold from strict to loose.

What the axes mean

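The curve plots the true positive rate (TPR, also called recall) on the vertical axis against the false positive rate (FPR) on the horizontal axis. TPR is the fraction of actual positives the model flags, and FPR is the fraction of actual negatives it flags by mistake. Each point corresponds to one threshold. A perfect model hugs the top left corner, catching all positives with no false alarms. A random model sits on the diagonal.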
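As a minimal sketch of how the points on the curve arise, the snippet below sweeps the threshold over a tiny made-up dataset and records one (FPR, TPR) pair per threshold. The function name roc_points and the example scores and labels are illustrative assumptions, not part of any library.

```python
def roc_points(scores, labels):
    """Return (fpr, tpr) pairs, one per threshold, from strict to loose.

    labels are 1 for positive and 0 for negative; a sample is predicted
    positive when its score is at or above the threshold.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    # A threshold above every score predicts nothing positive.
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points


# Hypothetical scores and labels, chosen only for illustration.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

The curve always starts at (0, 0), where the threshold is so strict that nothing is flagged, and ends at (1, 1), where everything is flagged.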

AUC

The area under the curve, or AUC, summarizes the whole ROC curve in one number between 0 and 1. It equals the probability that a randomly chosen positive gets a higher score than a randomly chosen negative, with ties counted as half. An AUC of 1 is perfect, an AUC of 0.5 is no better than random, and an AUC below 0.5 means the scores rank negatives above positives more often than not.
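The probabilistic reading of AUC can be checked directly by comparing every positive with every negative. The sketch below does this by brute force; the name auc_pairwise and the sample data are assumptions made for illustration, and the quadratic loop is meant for clarity rather than speed.

```python
def auc_pairwise(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative counts as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores and labels, chosen only for illustration.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

auc = auc_pairwise(scores, labels)
print(f"AUC = {auc:.3f}")
```

Here eight of the nine positive-negative pairs are ordered correctly, so the AUC is 8/9. The only miss is the positive scored 0.4 against the negative scored 0.7.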

A caution

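AUC is threshold independent, which makes it useful for comparing models, but on heavily imbalanced data the precision recall curve can be more informative. When negatives vastly outnumber positives, even a large count of false positives yields a small false positive rate, so the ROC curve can look optimistic.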
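A small arithmetic sketch may make the imbalance effect concrete. The confusion-matrix counts below are invented for illustration (10 positives against 10,000 negatives), and the helper name rates is hypothetical.

```python
def rates(tp, fp, fn, tn):
    """Compute the ROC axes plus precision from confusion-matrix counts."""
    return {
        "tpr": tp / (tp + fn),        # recall: share of positives caught
        "fpr": fp / (fp + tn),        # share of negatives wrongly flagged
        "precision": tp / (tp + fp),  # share of flagged items that are positive
    }


# Hypothetical counts at one threshold: 10 positives, 10,000 negatives.
r = rates(tp=8, fp=100, fn=2, tn=9900)
for name, value in r.items():
    print(f"{name}: {value:.3f}")
```

At this threshold the ROC point looks strong (TPR 0.8 at FPR 0.01), yet only 8 of the 108 flagged items are real positives, which the precision recall curve would make visible.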

Key idea

The ROC curve traces recall against false positive rate across thresholds, and AUC condenses it into the probability that a positive outscores a negative.
