More data for free
Augmentation creates new training images by transforming existing ones in ways that keep the label valid. A flipped cat is still a cat. This enlarges the effective training set and teaches the model invariances without requiring new labels.
Common transforms
- Geometric changes like flips, rotations, crops, and scaling.
- Photometric changes like brightness, contrast, and color jitter.
- Occlusion tricks like random erasing that hide patches.
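To make the three families concrete, here is a minimal NumPy sketch with one transform from each. It assumes images are float arrays of shape (H, W, C) with values in [0, 1]. The function names, the brightness range, and the `max_frac` parameter are illustrative choices, not taken from any particular library.

```python
import numpy as np

def horizontal_flip(img):
    """Geometric: mirror the image left to right."""
    return img[:, ::-1, :]

def adjust_brightness(img, factor):
    """Photometric: scale pixel values and keep them inside [0, 1]."""
    return np.clip(img * factor, 0.0, 1.0)

def random_erase(img, rng, max_frac=0.3):
    """Occlusion: zero out one randomly sized, randomly placed rectangle."""
    h, w = img.shape[:2]
    # Rectangle sides are at least 1 pixel and at most about max_frac of each side.
    eh = int(rng.integers(1, max(2, int(h * max_frac) + 1)))
    ew = int(rng.integers(1, max(2, int(w * max_frac) + 1)))
    top = int(rng.integers(0, h - eh + 1))
    left = int(rng.integers(0, w - ew + 1))
    out = img.copy()  # leave the caller's array untouched
    out[top:top + eh, left:left + ew, :] = 0.0
    return out

def augment(img, rng):
    """Apply one transform from each family with assumed, mild settings."""
    if rng.random() < 0.5:
        img = horizontal_flip(img)
    img = adjust_brightness(img, rng.uniform(0.8, 1.2))
    return random_erase(img, rng)

rng = np.random.default_rng(0)
image = np.full((32, 32, 3), 0.5)
augmented = augment(image, rng)
```

The label is not touched anywhere in this sketch, which is the point: each of these transforms is assumed to leave the correct answer unchanged.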
Mixing samples
Stronger methods combine images. Mixup blends two images pixel by pixel and blends their labels with the same weight, while CutMix pastes a patch from one image into another and mixes the labels in proportion to the patch area. Both push the model toward smoother decision boundaries.
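Both mixing methods can be sketched in a few lines of NumPy. The version below assumes one-hot label vectors and float images of equal shape. The `side_frac` parameter is a simplification introduced here for illustration: it fixes the patch size, whereas practical CutMix implementations usually sample it. The mixing weight `lam` is commonly drawn from a Beta distribution, and the value 0.2 used below is only an example.

```python
import numpy as np

def mixup(img_a, label_a, img_b, label_b, lam):
    """Blend two images and their one-hot labels with the same weight lam in [0, 1]."""
    img = lam * img_a + (1.0 - lam) * img_b
    label = lam * label_a + (1.0 - lam) * label_b
    return img, label

def cutmix(img_a, label_a, img_b, label_b, rng, side_frac=0.5):
    """Paste a patch of img_b into img_a and mix the labels by pasted area."""
    h, w = img_a.shape[:2]
    ph = max(1, int(h * side_frac))
    pw = max(1, int(w * side_frac))
    top = int(rng.integers(0, h - ph + 1))
    left = int(rng.integers(0, w - pw + 1))
    out = img_a.copy()
    out[top:top + ph, left:left + pw] = img_b[top:top + ph, left:left + pw]
    # The label weight for image B equals the fraction of pixels it contributes.
    area_b = (ph * pw) / (h * w)
    label = (1.0 - area_b) * label_a + area_b * label_b
    return out, label

rng = np.random.default_rng(0)
cat, dog = np.zeros((8, 8, 3)), np.ones((8, 8, 3))
cat_label, dog_label = np.array([1.0, 0.0]), np.array([0.0, 1.0])

lam = rng.beta(0.2, 0.2)  # assumed Beta parameters, for illustration only
mixed_img, mixed_label = mixup(cat, cat_label, dog, dog_label, lam)
pasted_img, pasted_label = cutmix(cat, cat_label, dog, dog_label, rng)
```

In both cases the resulting label is a soft target that still sums to 1, so it can be used with a cross-entropy loss that accepts probability vectors.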
Respect the label
The cardinal rule is that a transform must not change the correct answer. Horizontal flip is fine for natural scenes but wrong for reading digits or text where left and right carry meaning. Choose augmentations that match the task semantics.
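One way to enforce this rule in code is to keep an explicit per-task list of transforms that are considered label safe. The sketch below is a hypothetical arrangement: the task names, the `LABEL_SAFE` table, and the fixed brightness factor are all assumptions made for illustration.

```python
import numpy as np

def horizontal_flip(img):
    """Mirror the image left to right."""
    return img[:, ::-1]

def brighten(img):
    """Raise brightness slightly, keeping values inside [0, 1]."""
    return np.clip(img * 1.1, 0.0, 1.0)

# Hypothetical allowlist: flips are permitted for natural scenes only,
# because mirroring a digit or a line of text can change what it says.
LABEL_SAFE = {
    "natural_scenes": [horizontal_flip, brighten],
    "digits_or_text": [brighten],
}

def augment_for_task(img, task):
    """Apply only the transforms registered as label safe for this task."""
    for transform in LABEL_SAFE[task]:
        img = transform(img)
    return img

# A stroke on the left edge stays on the left for text, but moves for scenes.
glyph = np.zeros((4, 4))
glyph[:, 0] = 0.5
text_view = augment_for_task(glyph, "digits_or_text")
scene_view = augment_for_task(glyph, "natural_scenes")
```

Keeping the list explicit makes the task semantics reviewable: adding a transform to a task becomes a deliberate decision rather than a default.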
When to dial it back
Very aggressive augmentation can slow convergence or distort rare classes. Treat augmentation strength as a hyperparameter and tune it against validation accuracy.
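A simple way to treat strength as a hyperparameter is to map a single knob to the individual transform settings and compare a few values on validation data. The sketch below assumes a `strength` value in [0, 1]. The scaling constants and the `evaluate` callback, which stands in for "train with this strength and return validation accuracy", are hypothetical.

```python
def augmentation_params(strength):
    """Map one strength knob in [0, 1] to per-transform settings (assumed scales)."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    return {
        "brightness_delta": 0.4 * strength,  # maximum brightness change
        "max_erase_frac": 0.5 * strength,    # largest erased side fraction
        "rotation_deg": 30.0 * strength,     # maximum rotation angle
    }

def select_strength(candidates, evaluate):
    """Score each candidate with the supplied callback and return the best one."""
    scores = {s: evaluate(s) for s in candidates}
    best = max(scores, key=scores.get)
    return best, scores
```

In use, `evaluate` would wrap a full training run, so the candidate list is usually kept short, for example `[0.0, 0.25, 0.5, 0.75, 1.0]`.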
Key idea
Augmentation builds label-preserving variants through geometric, photometric, occlusion, and mixing transforms, improving robustness as long as each transform keeps the label correct.