LoRA Adapters

Fine-tuning huge models by training tiny low-rank weight updates.

The motivation

Full fine-tuning of a giant model updates billions of weights, which is expensive and produces a full-size copy of the model for every task. LoRA, short for low-rank adaptation, makes this cheap by training only small add-on matrices.

The low rank trick

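Instead of changing the original weight matrix directly, LoRA learns a small update to it, expressed as the product of two thin matrices whose shared inner dimension (the rank) is much smaller than either side of the original matrix.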

  • The big pretrained weights stay frozen
  • A pair of low-rank matrices learns the change
  • Their product is added to the frozen weights during training and inference, and can be merged into them for deployment
  • Only these small matrices are trained and saved
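
The four points above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the layer sizes, the rank r, the scaling factor alpha / r, and the zero initialization of B are assumptions chosen for the example and are not stated in this lesson.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for one linear layer, chosen only for illustration.
d_out, d_in, r = 64, 128, 4
alpha = 8.0  # assumed scaling hyperparameter; the update is scaled by alpha / r

W = rng.normal(size=(d_out, d_in))     # frozen pretrained weights, never updated
A = rng.normal(size=(r, d_in)) * 0.01  # trainable thin matrix, small random init
B = np.zeros((d_out, r))               # trainable thin matrix, zero init so the update starts at 0


def lora_forward(x, W, A, B, alpha, r):
    """Frozen path plus low-rank path; the full-size update is never formed."""
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T


def merge(W, A, B, alpha, r):
    """Fold the low-rank product into a single matrix with the shape of W."""
    return W + (alpha / r) * (B @ A)


x = rng.normal(size=(2, d_in))  # a batch of two input vectors
y = lora_forward(x, W, A, B, alpha, r)
```

With B at zero the layer behaves exactly like the frozen base layer, and after training, merge gives a single matrix that produces the same outputs as the two-path forward pass.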

Because the update has low rank, the two thin matrices together hold only a tiny fraction of the parameters in the weights they adapt, often well under one percent.
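
To make the "well under one percent" figure concrete, here is a small worked calculation. The 4096 by 4096 layer and rank 8 are hypothetical values picked for the arithmetic, not numbers from this lesson.

```python
def lora_param_fraction(d_out, d_in, r):
    """Compare parameters in a rank-r update with the full weight matrix."""
    full = d_out * d_in        # entries in the original matrix
    lora = r * (d_out + d_in)  # entries in the two thin matrices
    return lora, full, lora / full


# Hypothetical layer: 4096 x 4096 weights adapted with rank 8.
lora, full, frac = lora_param_fraction(4096, 4096, 8)
print(f"{lora:,} of {full:,} parameters ({frac:.2%})")
# prints: 65,536 of 16,777,216 parameters (0.39%)
```

The fraction shrinks as the layer gets larger, because the full matrix grows with the product of its dimensions while the thin matrices grow with their sum.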

Why it is popular

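LoRA adapters are small files you can swap per task while sharing one base model. This is a form of parameter-efficient fine-tuning (PEFT). A common variant called QLoRA combines LoRA with quantization: the frozen base weights are stored in low precision (4-bit in the original method) while the adapters train in higher precision, so even very large models can be tuned on a single GPU.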
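
The swap-per-task idea can be sketched as one shared weight matrix plus a lookup of small adapter pairs. Everything named here is hypothetical: the task names, the sizes, and the in-memory dictionary stand in for adapter files that a real setup would load from disk.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 32, 2                      # hypothetical layer width and rank
W_base = rng.normal(size=(d, d))  # one shared, frozen base matrix

# Hypothetical per-task adapters, each a (B, A) pair of thin matrices.
adapters = {
    "summarize": (rng.normal(size=(d, r)), rng.normal(size=(r, d))),
    "translate": (rng.normal(size=(d, r)), rng.normal(size=(r, d))),
}


def forward(x, task=None):
    """Run the shared base layer, optionally adding one task's low-rank update."""
    y = x @ W_base.T
    if task is not None:
        B, A = adapters[task]
        y = y + (x @ A.T) @ B.T
    return y


x = rng.normal(size=(1, d))
y_sum = forward(x, "summarize")
y_tr = forward(x, "translate")
```

Switching tasks only changes which small pair is looked up; the base matrix is never copied or modified.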

Key idea

LoRA freezes the base model and trains tiny low-rank matrices as the weight update, enabling cheap, swappable fine-tuning of large models.

Check yourself

1. What does LoRA train instead of the full weight matrix?

2. What happens to the original pretrained weights in LoRA?

3. What does QLoRA add to LoRA?