More data for free
Models generalize better with more varied data, but collecting and labeling it is expensive. Data augmentation creates new training examples by applying label-preserving transformations to the ones you already have, effectively enlarging the dataset at little extra cost.
Common transformations
The trick is to change the input in ways that do not change its label:
- For images, flip, rotate, crop, recolor, or add noise, since a flipped cat is still a cat
- For text, swap synonyms, back-translate through another language, or delete random words
- For audio, shift pitch, stretch time, or mix in background sound
By seeing many variations, the model learns features that are invariant to these changes rather than memorizing exact pixels or words.
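As a rough illustration of the text case, the sketch below produces several variants of one labeled sentence using synonym swaps and random word deletion. It is a minimal example under stated assumptions, not a recommended pipeline: the `SYNONYMS` table, the function names, and the probabilities are all made up for illustration, and a real system would draw synonyms from a thesaurus or a language model.

```python
import random

# Hypothetical toy synonym table, used only for this illustration.
SYNONYMS = {
    "quick": ["fast", "speedy"],
    "happy": ["glad", "cheerful"],
}


def swap_synonyms(words, p, rng):
    """Replace each word that has a known synonym with probability p."""
    return [
        rng.choice(SYNONYMS[w]) if w in SYNONYMS and rng.random() < p else w
        for w in words
    ]


def random_delete(words, p, rng):
    """Drop each word with probability p, always keeping at least one word."""
    if not words:
        return []
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]


def augment_text(sentence, label, n, seed=0):
    """Return n (augmented_sentence, label) pairs; the label is never changed."""
    rng = random.Random(seed)
    words = sentence.split()
    augmented = []
    for _ in range(n):
        new_words = random_delete(swap_synonyms(words, 0.5, rng), 0.1, rng)
        augmented.append((" ".join(new_words), label))
    return augmented


examples = augment_text("the quick dog looks happy today", "positive", n=3)
for text, label in examples:
    print(label, "|", text)
```

Every variant carries the original label, which is the whole point: the transformations are chosen so that the meaning relevant to the task survives. The deletion probability is kept low here because removing too many words could change the sentiment.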
Cautions
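Augmentations must respect the task. Flipping or rotating a photo of a digit can change its identity (a six rotated 180 degrees reads as a nine) or produce a shape that is no digit at all, and overly aggressive noise can destroy the signal. Some stronger modern recipes, such as mixup, go beyond label-preserving transforms: they blend pairs of examples, and their labels, together.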
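To make the mixup idea concrete, here is a minimal sketch on plain Python lists. It assumes the usual formulation in which a mixing weight is drawn from a Beta(alpha, alpha) distribution and applied to both the inputs and their one-hot labels; the function name, the toy feature vectors, and the default `alpha=0.2` are illustrative choices, not values taken from this text.

```python
import random


def mixup(x1, y1, x2, y2, alpha=0.2, rng=random):
    """Blend two examples and their one-hot labels with one shared weight."""
    # lam is in [0, 1]; a small alpha tends to push it toward 0 or 1.
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


rng = random.Random(0)

# Toy feature vectors with one-hot labels over the classes [cat, dog].
x_cat, y_cat = [0.9, 0.1, 0.4], [1.0, 0.0]
x_dog, y_dog = [0.2, 0.8, 0.6], [0.0, 1.0]

x_mix, y_mix, lam = mixup(x_cat, y_cat, x_dog, y_dog, rng=rng)
print("weight:", lam)
print("mixed input:", x_mix)
print("mixed label:", y_mix)
```

Note that the mixed label is a soft target (part cat, part dog) rather than a single class, so the loss function must accept label distributions instead of hard class indices.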
Key idea
Data augmentation applies label-preserving transforms to expand and diversify training data, teaching the model invariances and reducing overfitting.