Machine Learning

Test-Time Augmentation

Averaging predictions over augmented copies of each test input for an accuracy bump that needs no retraining.

4 min read · advanced

Augmenting at inference

Augmentation is usually a training trick, but test-time augmentation (TTA) applies it at inference too. You create several augmented versions of a test input, predict on each, and average the predictions. The effect is like ensembling one model over many views of the same input.

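The procedure

  • Generate several label-preserving augmented views of the test input, usually including the unmodified original.
  • Run the trained model on each view to get one set of class probabilities per view.
  • Average the probabilities across views and take the top class of the average as the final prediction.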
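
The steps described above (build views, predict on each, average) can be sketched in a few lines of dependency-free Python. This is a minimal illustration, not a reference implementation: `tta_predict`, `hflip`, the list-of-lists image format, and `toy_model` are all hypothetical names and stand-ins chosen for this example, and a real pipeline would call a trained network on tensors instead.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]                # toy grayscale image: rows of pixel values
Transform = Callable[[Image], Image]
Model = Callable[[Image], List[float]]   # returns one probability per class


def identity(img: Image) -> Image:
    """Return the input unchanged, so the original view is part of the average."""
    return img


def hflip(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def tta_predict(model: Model, img: Image, transforms: Sequence[Transform]) -> List[float]:
    """Average the model's class probabilities over augmented views of one input."""
    if not transforms:
        raise ValueError("need at least one transform")
    per_view = [model(t(img)) for t in transforms]   # one forward pass per view
    n_classes = len(per_view[0])
    return [sum(p[c] for p in per_view) / len(per_view) for c in range(n_classes)]


def toy_model(img: Image) -> List[float]:
    """Hypothetical stand-in classifier: p(class 1) is the mean of the left column."""
    p1 = sum(row[0] for row in img) / len(img)
    return [1.0 - p1, p1]


image = [[0.9, 0.5],
         [0.7, 0.5]]
probs = tta_predict(toy_model, image, [identity, hflip])
predicted_class = max(range(len(probs)), key=probs.__getitem__)
```

With these toy values the unmodified view gives p(class 1) of about 0.8 and the flipped view gives 0.5, so the averaged probability is about 0.65 and the predicted class is 1.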

Why it helps

  • Each view exposes different cues, and averaging smooths out idiosyncratic errors on any single view.
  • It tends to improve both accuracy and calibration, much like a small ensemble.
  • It needs no extra training, only more inference compute.

Choosing transforms

  • Use only label-preserving transforms, the same rule as training augmentation. A horizontal flip is usually safe for natural photos, for example, but not for digits or text.
  • Mild geometric and color changes work best; extreme distortions can hurt.
  • Average probabilities rather than counting hard votes, so each view's confidence carries into the final prediction.
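
To see why averaging probabilities and counting hard votes can disagree, here is a small sketch with made-up numbers. The function names `soft_vote` and `hard_vote` and the probabilities are assumptions for illustration only. Two views lean weakly toward class 0 while one is strongly confident in class 1.

```python
from collections import Counter
from typing import Sequence


def soft_vote(view_probs: Sequence[Sequence[float]]) -> int:
    """Average probabilities across views, then take the argmax."""
    n_views = len(view_probs)
    n_classes = len(view_probs[0])
    mean = [sum(p[c] for p in view_probs) / n_views for c in range(n_classes)]
    return max(range(n_classes), key=mean.__getitem__)


def hard_vote(view_probs: Sequence[Sequence[float]]) -> int:
    """Take each view's argmax, then return the most common label."""
    labels = [max(range(len(p)), key=p.__getitem__) for p in view_probs]
    return Counter(labels).most_common(1)[0][0]


# Made-up probabilities for three views of one input, two classes.
view_probs = [
    [0.55, 0.45],   # weakly class 0
    [0.52, 0.48],   # weakly class 0
    [0.05, 0.95],   # strongly class 1
]
soft = soft_vote(view_probs)   # mean is roughly [0.37, 0.63], so class 1
hard = hard_vote(view_probs)   # per-view labels are [0, 0, 1], so class 0
```

Hard voting treats the two near coin-flip views as full votes and discards how confident the third view was. Averaging keeps that information. Neither answer is guaranteed to be right on a given input.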

Practical notes

  • The cost is proportional to the number of views, so balance gain against latency.
  • It pairs well with model ensembles for competition-grade accuracy.
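
Both notes can be made concrete with a short sketch that averages over every (model, view) pair and reports how many forward passes that took. The helper `ensemble_tta_predict` and the two stand-in models are hypothetical and exist only to show the bookkeeping: with M models and V views, the number of passes is M × V.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Transform = Callable[[Vector], Vector]
Model = Callable[[Vector], List[float]]   # returns one probability per class


def ensemble_tta_predict(
    models: Sequence[Model], x: Vector, transforms: Sequence[Transform]
) -> Tuple[List[float], int]:
    """Average probabilities over every (model, view) pair and count forward passes."""
    outputs = [m(t(x)) for m in models for t in transforms]
    if not outputs:
        raise ValueError("need at least one model and one transform")
    n_classes = len(outputs[0])
    mean = [sum(o[c] for o in outputs) / len(outputs) for c in range(n_classes)]
    return mean, len(outputs)


# Hypothetical stand-in models on a two-feature input; not real classifiers.
def model_a(x: Vector) -> List[float]:
    return [1.0 - x[0], x[0]]


def model_b(x: Vector) -> List[float]:
    m = max(x)
    return [1.0 - m, m]


views = [lambda v: v, lambda v: v[::-1]]   # original and reversed feature order
mean_probs, n_passes = ensemble_tta_predict([model_a, model_b], [0.1, 0.7], views)
```

Here two models and two views cost four forward passes for a single input, which is the latency budget to weigh against any accuracy gain.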

Key idea

Test-time augmentation predicts on several label-preserving views of each input and averages the predictions, acting as a one-model ensemble. It tends to lift accuracy and calibration at the cost of extra inference compute, with no retraining.

Check yourself

1. What does test-time augmentation do?

2. What is the main cost of test-time augmentation?