Machine Learning

Catastrophic Forgetting

Why fine-tuning on new data can erase old capabilities.

Losing what was learned

When a model is fine-tuned on a new task, its weights shift to fit the new data. If that shift is too large, the model can lose abilities it had before. This is catastrophic forgetting: the model gains a new skill at the cost of old ones.

Why it happens

  • Weights are shared across tasks, so adjusting them for new data overwrites old patterns.
  • A narrow fine-tuning set pulls the model away from its broad prior knowledge.
  • High learning rates and long training amplify the drift.
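
To make the mechanism concrete, here is a minimal sketch in plain Python. It assumes a toy two-weight linear model and two made-up tasks (task_a and task_b); the data, learning rate, and epoch counts are illustrative choices, not taken from any real system. Both tasks read the same weights, so fitting the narrow new task moves a weight the old task depends on.

```python
# Toy illustration (all names and numbers here are hypothetical).
# One linear model y = w[0]*x[0] + w[1]*x[1] is shared by two tasks.

def predict(w, x):
    return w[0] * x[0] + w[1] * x[1]

def mse(w, data):
    """Mean squared error of weights w on a list of (x, y) pairs."""
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)

def sgd(w, data, lr, epochs):
    """Plain per-example gradient descent on squared error."""
    w = list(w)
    for _ in range(epochs):
        for x, y in data:
            err = predict(w, x) - y
            for i in range(len(w)):
                w[i] -= lr * 2 * err * x[i]
    return w

# Old task: only constrains the sum w[0] + w[1] (it must equal 2).
task_a = [((1.0, 1.0), 2.0), ((2.0, 2.0), 4.0)]
# New, narrow task: only constrains w[0] (it must equal 3).
task_b = [((1.0, 0.0), 3.0), ((2.0, 0.0), 6.0)]

# "Pretrain" on the old task, then fine-tune on the new task alone.
w_pre = sgd([0.0, 0.0], task_a, lr=0.05, epochs=100)
loss_a_before = mse(w_pre, task_a)

w_ft = sgd(w_pre, task_b, lr=0.05, epochs=100)
loss_a_after = mse(w_ft, task_a)
loss_b_after = mse(w_ft, task_b)

print("old-task loss before fine-tuning:", loss_a_before)
print("old-task loss after fine-tuning: ", loss_a_after)
print("new-task loss after fine-tuning: ", loss_b_after)
```

In this sketch the new task is learned almost perfectly while the old-task loss rises sharply, even though a single set of weights (w = [3, -1]) would have fit both tasks.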

[Figure: The drift]

Ways to reduce it

  • Lower learning rates and fewer epochs limit how far weights move.
  • Replaying some original or diverse data keeps old skills active.
  • Parameter-efficient methods freeze the backbone and train only a small set of added parameters, so the original weights are preserved by construction.
  • Regularization can penalize moving weights that mattered for prior tasks.
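
Three of these options can be compared side by side in a small sketch, again in plain Python and under the same assumptions: a hypothetical two-weight linear model, made-up tasks task_a and task_b, and illustrative hyperparameters. The regularizer shown is a uniform L2 pull toward the pretrained weights, which is a simplified stand-in for methods that weight the penalty by how much each parameter mattered for prior tasks. The frozen-backbone option is not shown because this toy model has no separate backbone.

```python
# Toy comparison of mitigation strategies (names and numbers are hypothetical).

def predict(w, x):
    return w[0] * x[0] + w[1] * x[1]

def mse(w, data):
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)

def sgd(w, data, lr, epochs, anchor=None, lam=0.0):
    """Per-example gradient descent; if anchor is given, add an L2
    penalty lam * ||w - anchor||^2 that pulls weights back toward it."""
    w = list(w)
    for _ in range(epochs):
        for x, y in data:
            err = predict(w, x) - y
            for i in range(len(w)):
                grad = 2 * err * x[i]
                if anchor is not None:
                    grad += 2 * lam * (w[i] - anchor[i])
                w[i] -= lr * grad
    return w

task_a = [((1.0, 1.0), 2.0), ((2.0, 2.0), 4.0)]  # old task
task_b = [((1.0, 0.0), 3.0), ((2.0, 0.0), 6.0)]  # new, narrow task

w_pre = sgd([0.0, 0.0], task_a, lr=0.05, epochs=100)

runs = {
    # Baseline: aggressive fine-tuning on the new task only.
    "naive": sgd(w_pre, task_b, lr=0.05, epochs=100),
    # Gentle updates: lower learning rate and fewer epochs.
    "gentle": sgd(w_pre, task_b, lr=0.01, epochs=3),
    # Replay: mix the old data back into the fine-tuning set.
    "replay": sgd(w_pre, task_b + task_a, lr=0.05, epochs=100),
    # Regularization: penalize moving away from the pretrained weights.
    "anchored": sgd(w_pre, task_b, lr=0.05, epochs=100, anchor=w_pre, lam=1.0),
}

# Map each strategy to (old-task loss, new-task loss).
results = {name: (mse(w, task_a), mse(w, task_b)) for name, w in runs.items()}

for name, (loss_a, loss_b) in results.items():
    print(f"{name:9s} old-task loss = {loss_a:.4f}   new-task loss = {loss_b:.4f}")
```

In this toy setup, gentle updates and the L2 anchor keep more of the old task by learning less of the new one, while replay recovers both because one set of weights happens to fit both tasks. Real models rarely offer such a clean outcome, so the numbers should be read as an illustration of the trade-offs rather than as expected results.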

Key idea

Catastrophic forgetting is the loss of prior abilities when fine-tuning overwrites shared weights. It is mitigated by gentle updates, data replay, frozen backbones, or regularization.

Check yourself

1. What is catastrophic forgetting?

2. Which approach helps prevent forgetting?