Machine Learning

The Confusion Matrix

The four-count table that every classification metric is built on.

What it tabulates

A confusion matrix is a small table that compares predicted labels against true labels. For a binary problem it has four cells, and every prediction is counted in exactly one of them.

  • True positives are positives the model correctly flagged.
  • True negatives are negatives it correctly left alone.
  • False positives are negatives wrongly flagged, also called Type I errors.
  • False negatives are positives the model missed, also called Type II errors.
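
The four cells above can be tallied with a few lines of plain Python. This is only a minimal sketch: the function name `confusion_counts`, the `positive` parameter, and the eight example labels are all made up for illustration, with 1 standing for the positive class.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Count the four binary confusion-matrix cells (hypothetical helper)."""
    counts = Counter()
    for truth, guess in zip(y_true, y_pred):
        if guess == positive:
            # The model flagged this item: correct (tp) or wrongly flagged (fp).
            counts["tp" if truth == positive else "fp"] += 1
        else:
            # The model left this item alone: missed positive (fn) or correct (tn).
            counts["fn" if truth == positive else "tn"] += 1
    return {cell: counts[cell] for cell in ("tp", "tn", "fp", "fn")}


# Illustrative labels only, not real data.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

cells = confusion_counts(y_true, y_pred)
print(cells)  # {'tp': 3, 'tn': 3, 'fp': 1, 'fn': 1}
```

The four counts always sum to the number of predictions, which is a quick sanity check on any matrix you build.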

Why it matters

Accuracy alone hides where mistakes happen. The matrix shows the shape of the errors, which is vital when one error type is far costlier than the other. A cancer screen that misses sick patients fails differently from one that over-flags healthy ones.
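
A small sketch of how accuracy can hide the problem, using invented numbers: assume a screen of 1,000 patients of whom 10 are sick, and a hypothetical model that flags nobody.

```python
# Invented screening data: 10 sick patients (1) and 990 healthy ones (0).
y_true = [1] * 10 + [0] * 990
# A hypothetical model that never flags anyone.
y_pred = [0] * 1000

correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)

# Cells of the confusion matrix that involve the sick patients.
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
false_negatives = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

print(accuracy)         # 0.99
print(true_positives)   # 0
print(false_negatives)  # 10
```

The accuracy looks excellent, yet the false-negative cell shows that every sick patient was missed.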

Reading it

Most metrics are ratios of these four counts. Precision divides true positives by all predicted positives. Recall divides true positives by all actual positives. Looking at the raw cells first keeps you honest before you trust any single summary number.
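
The two ratios described above can be written directly in terms of the cell counts. The sketch below uses invented counts, and returning 0.0 when a denominator is zero is a convention chosen here for illustration, not a universal rule.

```python
def precision(tp, fp):
    """True positives divided by all predicted positives."""
    predicted_positives = tp + fp
    # Assumption: report 0.0 when the model predicted no positives at all.
    return tp / predicted_positives if predicted_positives else 0.0


def recall(tp, fn):
    """True positives divided by all actual positives."""
    actual_positives = tp + fn
    # Assumption: report 0.0 when the data contains no positives at all.
    return tp / actual_positives if actual_positives else 0.0


# Invented counts for illustration.
tp, fp, fn = 8, 2, 8

print(precision(tp, fp))  # 0.8
print(recall(tp, fn))     # 0.5
```

Here the model is right about most of what it flags but finds only half of the actual positives, a gap that a single summary number would not reveal.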

Key idea

The confusion matrix counts true and false positives and negatives, giving the raw material from which precision, recall, and most other classification metrics are computed.

Check yourself

1. What is a false negative?

2. Why prefer the matrix over accuracy alone?