Machine Learning

The F1 Score

Combining precision and recall into one balanced number with a harmonic mean.

Why combine two metrics

Precision and recall each tell only half the story, and they trade off against each other. Reporting both is honest but awkward when you need to rank models or tune a threshold. The F1 score folds them into a single value.

The harmonic mean

F1 is the harmonic mean of precision and recall, not the ordinary (arithmetic) average: F1 = 2 × precision × recall / (precision + recall). The harmonic mean is pulled toward the smaller of the two numbers:

  • If precision is high but recall is near zero, F1 stays near zero
  • F1 is only high when both precision and recall are high
  • This punishes models that game one metric while ignoring the other
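
To make the points above concrete, here is a minimal Python sketch that compares the harmonic mean with the plain average. The function names and the precision/recall values are hypothetical, chosen only for illustration, and returning 0.0 when both inputs are zero is an assumed convention rather than something the lesson specifies.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed to be 0.0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def arithmetic_mean(precision: float, recall: float) -> float:
    """Ordinary average, shown only for comparison."""
    return (precision + recall) / 2


# Hypothetical model: very precise, but finds almost no positives.
lopsided_f1 = f1_score(0.95, 0.02)
lopsided_avg = arithmetic_mean(0.95, 0.02)

# Hypothetical model: moderately good at both.
balanced_f1 = f1_score(0.70, 0.70)

print(f"lopsided: F1={lopsided_f1:.3f}, plain average={lopsided_avg:.3f}")
print(f"balanced: F1={balanced_f1:.3f}")
```

The lopsided model gets a plain average near 0.49 but an F1 below 0.05, while the balanced model keeps an F1 of about 0.70, which is the behavior the bullets describe.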

Variants

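The general F beta score adds a weight, beta, that sets how much recall counts relative to precision. When beta is above one, recall is weighted more heavily, which is useful when missing positives is costly; when beta is below one, precision is weighted more; beta equal to one gives F1. On imbalanced data F1 is usually more informative than plain accuracy, because accuracy can look high for a model that simply predicts the majority class.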
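
As a rough sketch of both claims, the Python below uses the standard form F beta = (1 + beta²) × precision × recall / (beta² × precision + recall). The function name, the example precision and recall, and the confusion counts are all hypothetical, and treating precision as 0.0 when the model predicts no positives is an assumed convention.

```python
def fbeta_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """F beta from precision and recall; beta > 1 favors recall, beta < 1 favors precision."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    beta_sq = beta * beta
    denominator = beta_sq * precision + recall
    if denominator == 0:
        return 0.0  # assumed convention when both inputs are 0
    return (1 + beta_sq) * precision * recall / denominator


# Hypothetical model with better precision than recall.
precision, recall = 0.9, 0.5
f_half = fbeta_score(precision, recall, beta=0.5)  # leans toward precision
f_one = fbeta_score(precision, recall, beta=1.0)   # plain F1
f_two = fbeta_score(precision, recall, beta=2.0)   # leans toward recall

# Hypothetical imbalanced set: 990 negatives, 10 positives,
# scored by a model that predicts "negative" for everything.
tp, fp, fn, tn = 0, 0, 10, 990
accuracy = (tp + tn) / (tp + fp + fn + tn)
always_negative_precision = tp / (tp + fp) if (tp + fp) else 0.0  # assumed 0.0 if undefined
always_negative_recall = tp / (tp + fn) if (tp + fn) else 0.0
always_negative_f1 = fbeta_score(always_negative_precision, always_negative_recall)

print(f"F0.5={f_half:.3f}  F1={f_one:.3f}  F2={f_two:.3f}")
print(f"always-negative model: accuracy={accuracy:.2f}, F1={always_negative_f1:.2f}")
```

Because this model's recall is its weak side, the score drops as beta rises (roughly 0.78, 0.64, 0.55). The always-negative model reaches 0.99 accuracy while its F1 is 0, since it never finds a positive.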

Key idea

The F1 score is the harmonic mean of precision and recall, rewarding models only when both are strong, with F beta tilting the balance toward one or the other.

Check yourself

1. Why use the harmonic mean for F1?

2. Raising beta above one in F beta does what?