Machine Learning

Full Fine Tuning

Updating every weight of a pretrained model to adapt it to a new task.

Starting from a pretrained model

A model trained on a huge general corpus already knows a great deal about language or images. Full fine tuning continues training that model on a smaller, task-specific dataset and updates every parameter, so the whole model specializes to the new task.

Why it works

  • The pretrained weights give a strong starting point, so far less data is needed than training from scratch.
  • A smaller learning rate than in pretraining is used, so each update refines what the model already knows rather than overwriting it.
  • Gradients flow through the whole network, letting all layers adapt.

The training flow
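
In outline: load the pretrained weights, compute the loss on the task data, take gradients with respect to every parameter, apply a small update to all of them, and repeat. The sketch below illustrates that loop on a toy model in plain Python. It is only an illustration under stated assumptions: the three-parameter model, its "pretrained" values, the dataset, and the names loss_and_grads and full_fine_tune are all hypothetical, and a real setup would use a deep learning framework with automatic differentiation.

```python
# Toy "pretrained" model: y = w2 * (w1 * x) + b. Values are hypothetical.
pretrained = {"w1": 1.0, "w2": 2.0, "b": 0.0}

# Small task-specific dataset (made up): targets follow y = 2.5 * x + 0.5.
task_data = [(x, 2.5 * x + 0.5) for x in (-1.0, 0.0, 1.0, 2.0)]


def loss_and_grads(params, data):
    """Return mean squared error and its gradient for every parameter."""
    grads = {name: 0.0 for name in params}
    loss = 0.0
    for x, y in data:
        hidden = params["w1"] * x                    # first "layer"
        pred = params["w2"] * hidden + params["b"]   # second "layer"
        err = pred - y
        loss += err * err / len(data)
        scale = 2.0 * err / len(data)                # d(loss)/d(pred)
        # Chain rule: the gradient reaches every parameter, not just the last.
        grads["w1"] += scale * params["w2"] * x
        grads["w2"] += scale * hidden
        grads["b"] += scale
    return loss, grads


def full_fine_tune(params, data, lr=0.01, steps=200):
    """Gradient descent that updates all parameters with a small step size."""
    params = dict(params)  # work on a copy; the pretrained weights stay intact
    for _ in range(steps):
        _, grads = loss_and_grads(params, data)
        for name in params:  # every weight is updated, none are frozen
            params[name] -= lr * grads[name]
    return params


loss_before, _ = loss_and_grads(pretrained, task_data)
tuned = full_fine_tune(pretrained, task_data)
loss_after, _ = loss_and_grads(tuned, task_data)
print(f"loss before: {loss_before:.4f}, after: {loss_after:.4f}")
```

The result tuned is a complete second set of weights alongside pretrained, which is the "full new copy" that the cost discussion refers to.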

The cost

Full fine tuning produces a complete new copy of the model. For large models this is expensive in memory and storage: training must hold a gradient and optimizer state for every weight, and each task needs its own full checkpoint. These costs motivate the parameter-efficient methods that follow.
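
A rough back-of-the-envelope estimate makes the memory cost concrete. The sketch below assumes 32-bit values and an Adam-style optimizer that keeps two extra values per weight; it ignores activations and any framework overhead, so it is a lower bound rather than a measurement. The function name full_fine_tune_bytes and the 7-billion-parameter model size are hypothetical choices for illustration.

```python
def full_fine_tune_bytes(num_params, bytes_per_value=4, optimizer_states_per_param=2):
    """Estimate training memory for full fine tuning, excluding activations.

    Assumes every weight needs one gradient value and a fixed number of
    optimizer state values (2 for Adam-style optimizers).
    """
    weights = num_params * bytes_per_value
    gradients = num_params * bytes_per_value
    optimizer = num_params * bytes_per_value * optimizer_states_per_param
    return {
        "weights": weights,
        "gradients": gradients,
        "optimizer": optimizer,
        "total": weights + gradients + optimizer,
    }


# Hypothetical model with 7 billion parameters.
estimate = full_fine_tune_bytes(7_000_000_000)
for part, size in estimate.items():
    print(f"{part:>9}: {size / 1e9:.0f} GB")
```

Under these assumptions the weights alone take 28 GB, which is also the size of the checkpoint saved per task, while training needs about 112 GB before counting activations.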

Key idea

Full fine tuning updates all weights of a pretrained model on task data, giving strong adaptation at the cost of a full model copy and heavy memory use per task.

Check yourself

1. What does full fine tuning update?

2. Why use a smaller learning rate when fine tuning?