Machine Learning

Image Augmentation Strategies

Expanding data with label-preserving transforms to fight overfitting.

4 min read · intro

More data for free

Augmentation creates new training images by transforming existing ones in ways that keep the label valid. A flipped cat is still a cat. This enlarges the effective training set and teaches the model invariances without requiring new labels.

Common transforms

  • Geometric changes like flips, rotations, crops, and scaling.
  • Photometric changes like brightness, contrast, and color jitter.
  • Occlusion tricks like random erasing that hide patches.
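
One transform from each family above can be sketched in a few lines of NumPy. This is a minimal illustration, not a library API: the function names, the 32x32 random stand-in image, and the assumption that pixel values lie in [0, 1] are all made up for the example.

```python
import numpy as np

def horizontal_flip(img):
    """Geometric: mirror the image left to right (axis 1 is width)."""
    return img[:, ::-1]

def random_crop(img, size, rng):
    """Geometric: cut a random size x size window out of the image."""
    h, w = img.shape[:2]
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    return img[top:top + size, left:left + size]

def adjust_brightness_contrast(img, brightness=0.0, contrast=1.0):
    """Photometric: scale contrast around the mean, then shift brightness."""
    mean = img.mean()
    out = (img - mean) * contrast + mean + brightness
    return np.clip(out, 0.0, 1.0)  # assumes pixel values in [0, 1]

def random_erase(img, patch, rng, fill=0.0):
    """Occlusion: overwrite one random patch x patch square with a constant."""
    h, w = img.shape[:2]
    top = rng.integers(0, h - patch + 1)
    left = rng.integers(0, w - patch + 1)
    out = img.copy()  # leave the original image untouched
    out[top:top + patch, left:left + patch] = fill
    return out

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))  # hypothetical stand-in for a real photo

flipped = horizontal_flip(image)
cropped = random_crop(image, 24, rng)
brighter = adjust_brightness_contrast(image, brightness=0.1, contrast=1.2)
erased = random_erase(image, 8, rng)
```

None of these functions touches the label, which is the point: the same label is reused for every variant.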

Mixing samples

Stronger methods combine images. Mixup blends two images pixel by pixel and blends their labels with the same weight, while CutMix pastes a patch from one image into another and mixes the labels in proportion to patch area. Both push the model toward smoother decision boundaries.
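
A rough sketch of both mixing methods, assuming one-hot labels and two same-sized images. The function names, the fixed mixing weight, and the fixed patch position are choices made for this example; in practice both are usually sampled at random for each batch.

```python
import numpy as np

def mixup(img_a, label_a, img_b, label_b, lam):
    """Blend two images and their one-hot labels with the same weight lam."""
    img = lam * img_a + (1.0 - lam) * img_b
    label = lam * label_a + (1.0 - lam) * label_b
    return img, label

def cutmix(img_a, label_a, img_b, label_b, top, left, ph, pw):
    """Paste a ph x pw patch of img_b into img_a; mix labels by pixel area.

    Assumes the patch lies fully inside the image.
    """
    h, w = img_a.shape[:2]
    out = img_a.copy()
    out[top:top + ph, left:left + pw] = img_b[top:top + ph, left:left + pw]
    patch_frac = (ph * pw) / (h * w)  # share of pixels taken from img_b
    label = (1.0 - patch_frac) * label_a + patch_frac * label_b
    return out, label

# Hypothetical two-class setup: an all-black "cat" and an all-white "dog".
cat = np.zeros((32, 32, 3))
cat_label = np.array([1.0, 0.0])
dog = np.ones((32, 32, 3))
dog_label = np.array([0.0, 1.0])

# Fixed weight for illustration only.
mixed_img, mixed_label = mixup(cat, cat_label, dog, dog_label, lam=0.7)

# A 16x16 patch covers a quarter of a 32x32 image.
cut_img, cut_label = cutmix(cat, cat_label, dog, dog_label,
                            top=8, left=8, ph=16, pw=16)
```

Here the CutMix label comes out as 75% cat and 25% dog, matching the share of pixels each image contributes.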

Respect the label

The cardinal rule is that a transform must not change the correct answer. A horizontal flip is fine for most natural scenes but wrong for reading digits or text, where left and right carry meaning. Choose augmentations that match the task semantics.
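
The digit case can be seen on toy data. The two 5x3 bitmaps below are hypothetical hand-drawn glyphs, not samples from any dataset, and `flip_changes_shape` is a helper invented for this sketch.

```python
import numpy as np

# Hypothetical 5x3 bitmaps; 1 marks an inked pixel.
three = np.array([[1, 1, 1],
                  [0, 0, 1],
                  [1, 1, 1],
                  [0, 0, 1],
                  [1, 1, 1]])

eight = np.array([[1, 1, 1],
                  [1, 0, 1],
                  [1, 1, 1],
                  [1, 0, 1],
                  [1, 1, 1]])

def flip_changes_shape(glyph):
    """True if mirroring left to right yields a different bitmap."""
    return not np.array_equal(np.fliplr(glyph), glyph)

print(flip_changes_shape(three))  # True: the mirrored 3 is no longer a 3
print(flip_changes_shape(eight))  # False: this 8 is left-right symmetric
```

A flip is harmless only when the mirrored image still deserves the same label, so for digits it is safer to leave it out entirely.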

When to dial it back

Overly aggressive augmentation can slow convergence or distort rare classes. Treat augmentation strength as a hyperparameter and tune it against validation accuracy.
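
Tuning strength can be as simple as a small sweep. The sketch below assumes a hypothetical `validate` callable that you supply: it should train with the given augmentation strength and return accuracy on held-out data. Nothing here is tied to a particular framework.

```python
def pick_strength(strengths, validate):
    """Return the strength with the best validation accuracy, plus all scores.

    `validate` is a hypothetical user-supplied callable: strength -> accuracy
    measured on held-out data after training with that strength.
    """
    scores = {s: validate(s) for s in strengths}
    best = max(scores, key=scores.get)
    return best, scores
```

A call such as `pick_strength([0.0, 0.25, 0.5, 1.0], validate)` returns the winning setting along with every score, so you can see where accuracy starts to drop as strength grows.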

Key idea

Augmentation builds label-preserving variants through geometric, photometric, occlusion, and mixing transforms, raising robustness as long as each transform keeps the label correct.

Check yourself

1. Why can horizontal flip be a bad augmentation for digit reading?

2. What does CutMix do?

3. What risk comes with very aggressive augmentation?