The core idea
Transfer learning reuses a model trained on one large task as a starting point for a different, often smaller task. Instead of learning from scratch, you inherit useful features.
Why it works
A model pretrained on large amounts of text or images learns general patterns. Lower layers capture broad features, such as edges in images or grammar in text, which stay useful across tasks.
- Pretraining happens once on a large general dataset
- The learned weights become a reusable starting point
- A new task reuses these weights and adapts them
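The three steps above can be sketched with a toy example. This is only an illustration under stated assumptions: the "model" is a single linear function, the datasets are made up, and names such as `train`, `source_data`, and `target_data` are hypothetical, not taken from any library.

```python
import copy

def predict(params, x):
    return params["w"] * x + params["b"]

def mse(params, data):
    return sum((predict(params, x) - y) ** 2 for x, y in data) / len(data)

def train(params, data, lr=0.05, steps=200):
    """Plain gradient descent on mean squared error; returns a new params dict."""
    p = copy.deepcopy(params)
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (predict(p, x) - y) * x for x, y in data) / n
        grad_b = sum(2 * (predict(p, x) - y) for x, y in data) / n
        p["w"] -= lr * grad_w
        p["b"] -= lr * grad_b
    return p

# Step 1: pretrain once on a large, general dataset (assumed: y = 2x + 1).
source_data = [(k / 10, 2 * (k / 10) + 1) for k in range(-50, 51)]
scratch = {"w": 0.0, "b": 0.0}
pretrained = train(scratch, source_data)

# Step 2: the learned weights are the reusable starting point.
# Step 3: a small, related target task (assumed: y = 2.2x + 0.8) adapts them
# with only a handful of update steps.
target_data = [(-1.0, -1.4), (0.0, 0.8), (1.0, 3.0)]
adapted_from_pretrained = train(pretrained, target_data, steps=5)
adapted_from_scratch = train(scratch, target_data, steps=5)

print("target loss, pretrained start:", mse(adapted_from_pretrained, target_data))
print("target loss, scratch start:   ", mse(adapted_from_scratch, target_data))
```

With the same small budget of target updates, the run that starts from the pretrained weights ends with a lower target loss, because the source and target tasks were chosen to be similar.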
Two common modes
- Feature extraction freezes the pretrained model and trains only a small new head on top
- Fine-tuning unfreezes some or all layers and updates them on the new data
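The difference between the two modes comes down to which parameters receive updates. A minimal sketch, assuming a toy model with a one-parameter-pair "backbone" and "head", hand-written gradients, and made-up numbers (the `trainable` argument and all names here are hypothetical, chosen for illustration):

```python
import copy

def forward(model, x):
    # Backbone turns the input into a feature; head maps the feature to an output.
    h = model["backbone"]["w"] * x + model["backbone"]["b"]
    y = model["head"]["w"] * h + model["head"]["b"]
    return h, y

def loss(model, data):
    return sum((forward(model, x)[1] - t) ** 2 for x, t in data) / len(data)

def train(model, data, trainable, lr=0.05, steps=50):
    """Gradient descent that only updates the parts named in `trainable`."""
    m = copy.deepcopy(model)
    n = len(data)
    for _ in range(steps):
        grads = {"backbone": {"w": 0.0, "b": 0.0}, "head": {"w": 0.0, "b": 0.0}}
        for x, t in data:
            h, y = forward(m, x)
            dy = 2 * (y - t) / n                      # d(loss)/d(output)
            grads["head"]["w"] += dy * h
            grads["head"]["b"] += dy
            grads["backbone"]["w"] += dy * m["head"]["w"] * x
            grads["backbone"]["b"] += dy * m["head"]["w"]
        for part in trainable:                        # frozen parts are skipped
            for name in m[part]:
                m[part][name] -= lr * grads[part][name]
    return m

# Assumed starting point: a "pretrained" backbone plus a fresh, small head.
model = {"backbone": {"w": 1.5, "b": 0.0}, "head": {"w": 0.1, "b": 0.0}}
target_data = [(-1.0, -2.0), (0.0, 0.5), (1.0, 3.0)]

# Feature extraction: backbone frozen, only the head is trained.
feature_extraction = train(model, target_data, trainable={"head"})

# Fine-tuning: every part is unfrozen and updated on the new data.
fine_tuned = train(model, target_data, trainable={"backbone", "head"})

print("backbone unchanged:", feature_extraction["backbone"] == model["backbone"])
print("backbone updated:  ", fine_tuned["backbone"] != model["backbone"])
```

In a real framework the same choice is usually expressed by marking parameters as non-trainable or by passing only the head's parameters to the optimizer; the toy loop above just makes that selection explicit.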
Transfer learning shines when your target dataset is small, since the pretrained backbone already supplies most of the knowledge and you only need a little task-specific adjustment.
Key idea
Transfer learning starts from a pretrained model and adapts it, letting small datasets benefit from knowledge learned on large ones.