Threshold Tuning

Probabilities are not decisions

A classifier outputs a probability, but acting on it requires a yes or no. The decision threshold converts the probability into a label. The default threshold of 0.5 is rarely the best choice for a real problem.
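As a minimal sketch of that conversion, the snippet below turns scores into labels. The function name `to_labels`, the sample scores, and the convention that a score exactly at the threshold counts as positive are all assumptions made for illustration.

```python
def to_labels(probabilities, threshold=0.5):
    """Return 1 where the probability is at or above the threshold, else 0.

    Treating a score equal to the threshold as positive is an assumed
    convention; some tools use a strict "greater than" instead.
    """
    return [1 if p >= threshold else 0 for p in probabilities]


probs = [0.10, 0.35, 0.50, 0.80]  # made-up scores, not real model output

print(to_labels(probs))       # [0, 0, 1, 1]
print(to_labels(probs, 0.3))  # [0, 1, 1, 1]
```

The same scores give different labels at different thresholds, which is why the threshold is a choice to make and not a property of the scores.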

What moving it does

Lowering the threshold labels more cases positive, which raises recall and usually lowers precision.
Raising it does the reverse: fewer but surer positives, so precision usually rises while recall falls.
The right point depends on the cost of a false positive versus a false negative.
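The trade-off above can be seen on a small example. This is a sketch on made-up labels and scores. The helper names and the costs (a false negative assumed five times as costly as a false positive) are hypothetical, chosen only to show how the preferred threshold shifts with cost.

```python
def confusion_counts(y_true, probs, threshold):
    """Count true/false positives and negatives at one threshold."""
    tp = fp = fn = tn = 0
    for y, p in zip(y_true, probs):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall(y_true, probs, threshold):
    tp, fp, fn, _ = confusion_counts(y_true, probs, threshold)
    # With no predicted positives, precision is undefined; 1.0 is a convention.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def total_cost(y_true, probs, threshold, cost_fp, cost_fn):
    _, fp, fn, _ = confusion_counts(y_true, probs, threshold)
    return fp * cost_fp + fn * cost_fn


# Made-up data: 1 is the positive class.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
probs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

for t in (0.25, 0.50, 0.85):
    p, r = precision_recall(y_true, probs, t)
    cost = total_cost(y_true, probs, t, cost_fp=1, cost_fn=5)  # assumed costs
    print(f"threshold={t:.2f} precision={p:.2f} recall={r:.2f} cost={cost}")
# threshold=0.25 precision=0.67 recall=1.00 cost=2
# threshold=0.50 precision=0.75 recall=0.75 cost=6
# threshold=0.85 precision=1.00 recall=0.25 cost=15
```

With misses priced this heavily, the lowest threshold wins on cost even though its precision is the worst of the three. Reversing the costs would favor a higher threshold.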

How to choose

Sweep thresholds and plot precision against recall.
Pick the point that meets your operational target, such as a minimum recall.
The precision-recall curve and the receiver operating characteristic (ROC) curve each summarize all thresholds at once.
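The sweep-and-pick procedure can be sketched without any plotting library. The code below uses each distinct score as a candidate threshold and keeps the highest-precision threshold that still meets a minimum recall. The function names, the data, and the tie-breaking rule (prefer the higher threshold) are assumptions for illustration; a library routine for precision-recall curves would normally replace the hand-written sweep.

```python
def precision_recall(y_true, probs, threshold):
    """Precision and recall when scores at or above the threshold are positive."""
    pairs = list(zip(y_true, probs))
    tp = sum(1 for y, p in pairs if p >= threshold and y == 1)
    fp = sum(1 for y, p in pairs if p >= threshold and y == 0)
    fn = sum(1 for y, p in pairs if p < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0  # convention when no positives
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def sweep(y_true, probs):
    """Return (threshold, precision, recall) for every distinct score."""
    return [(t, *precision_recall(y_true, probs, t)) for t in sorted(set(probs))]


def pick_threshold(y_true, probs, min_recall):
    """Highest-precision threshold whose recall meets the target, or None."""
    candidates = [row for row in sweep(y_true, probs) if row[2] >= min_recall]
    if not candidates:
        return None
    # Ties on precision are broken toward the higher threshold (an assumption).
    return max(candidates, key=lambda row: (row[1], row[0]))


# Made-up validation data: 1 is the positive class.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
probs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

print(pick_threshold(y_true, probs, min_recall=0.75))  # (0.6, 0.75, 0.75)
```

The rows returned by `sweep` are the points of the precision-recall curve, so they can be plotted directly if a chart is wanted.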

Cautions

Choose the threshold on a validation set, not the test set.
On imbalanced data, tune the threshold against metrics for the rare class, such as its precision and recall, not against overall accuracy.
Calibrated probabilities make threshold choices more meaningful.
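The first caution can be sketched as a two-step workflow: choose the threshold using validation data only, then apply that fixed threshold to the test set once to report performance. All data and names below are made up for illustration, and the selection rule (best precision subject to a minimum recall) is only one possible target.

```python
def precision_recall(y_true, probs, threshold):
    """Precision and recall when scores at or above the threshold are positive."""
    pairs = list(zip(y_true, probs))
    tp = sum(1 for y, p in pairs if p >= threshold and y == 1)
    fp = sum(1 for y, p in pairs if p >= threshold and y == 0)
    fn = sum(1 for y, p in pairs if p < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0  # convention when no positives
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def choose_threshold(y_val, p_val, min_recall):
    """Pick the best-precision threshold meeting min_recall on validation data."""
    best = None
    for t in sorted(set(p_val)):
        precision, recall = precision_recall(y_val, p_val, t)
        if recall >= min_recall and (best is None or precision >= best[1]):
            best = (t, precision)
    return None if best is None else best[0]


# Made-up validation and test splits; the test split plays no part in tuning.
y_val = [0, 0, 1, 0, 1, 1, 0, 1]
p_val = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
y_test = [0, 1, 0, 1, 1, 0]
p_test = [0.2, 0.5, 0.7, 0.65, 0.9, 0.3]

threshold = choose_threshold(y_val, p_val, min_recall=0.75)  # tuned on validation
test_precision, test_recall = precision_recall(y_test, p_test, threshold)

print(threshold)                                     # 0.6
print(f"{test_precision:.2f} {test_recall:.2f}")     # 0.67 0.67
```

In this toy example the test recall falls below the validation target, which is the kind of gap that tuning on the test set would hide.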

Key idea

For a given model, the decision threshold sets the precision-recall balance. Tune it on validation data against the real costs of false positives and false negatives.
