Combining two metrics
Precision and recall often pull in opposite directions: tuning a model to raise one tends to lower the other. It helps to summarize both with one number. The F1 score is the harmonic mean of precision and recall: F1 = 2 × precision × recall / (precision + recall).
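As a rough illustration, the formula fits in a few lines of Python. The function name f1_score, the sample values, and the choice to return 0.0 when both inputs are zero are assumptions made for this sketch, not something the text specifies.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        # Assumption: treat F1 as 0.0 when both inputs are zero,
        # which avoids dividing by zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up values, for illustration only.
print(round(f1_score(0.8, 0.6), 3))  # 0.686
```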
Why the harmonic mean
The harmonic mean punishes imbalance: it can never exceed twice the smaller of the two values. If precision is high but recall is near zero, the F1 score stays near zero too. An arithmetic mean would hide that failure, but the harmonic mean refuses to reward a model that ignores one side.
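To make the contrast concrete, here is a small sketch that compares the two means using Python's standard statistics module. The precision and recall values are invented for the example.

```python
from statistics import harmonic_mean, mean

# Hypothetical model: very precise, but it misses almost every positive case.
precision = 0.95
recall = 0.05

arithmetic = mean([precision, recall])
harmonic = harmonic_mean([precision, recall])  # same quantity as the F1 score

print(round(arithmetic, 3))  # 0.5, which looks acceptable
print(round(harmonic, 3))    # 0.095, which exposes the weak recall
```

The arithmetic mean lands in the middle of the two values, while the harmonic mean stays close to the smaller one.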
Adding a lean with F beta
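Sometimes you care more about one metric than the other. The F beta score introduces a weight called beta that sets how much recall counts relative to precision: F beta = (1 + beta^2) × precision × recall / (beta^2 × precision + recall).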
- A beta above one favors recall.
- A beta below one favors precision.
- Beta equal to one recovers the standard F1.
A common choice is F2, which treats recall as twice as important as precision.
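The effect of beta can be sketched with the standard formula, F beta = (1 + beta^2) × precision × recall / (beta^2 × precision + recall). The function name f_beta, the zero-denominator convention, and the sample values below are assumptions made for illustration.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    b2 = beta ** 2
    denominator = b2 * precision + recall
    if denominator == 0:
        # Assumption: return 0.0 when precision and recall are both zero.
        return 0.0
    return (1 + b2) * precision * recall / denominator


# Made-up model whose recall (0.8) is better than its precision (0.5).
precision, recall = 0.5, 0.8

print(round(f_beta(precision, recall, beta=0.5), 3))  # 0.541, leans toward precision
print(round(f_beta(precision, recall, beta=1.0), 3))  # 0.615, the standard F1
print(round(f_beta(precision, recall, beta=2.0), 3))  # 0.714, leans toward recall
```

Because this hypothetical model has stronger recall than precision, its score rises as beta grows and falls as beta shrinks.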
Key idea
The F1 score blends precision and recall with a harmonic mean that penalizes ignoring either one. The F beta score lets you tilt that blend toward whichever metric matters more.