Classifier-Free Guidance

Steer diffusion samples toward a prompt by mixing conditional and unconditional predictions.

Classifier-free guidance (CFG) is the technique that lets diffusion models follow a text prompt closely without a separate classifier. It steers generation by blending two noise predictions from the same network.

The two predictions

During training, the condition is randomly dropped for a fraction of examples, so a single network learns both a conditional and an unconditional denoiser.
At sampling time, the model runs twice at each step, predicting the noise once with the prompt and once without it.
The difference between the two predictions points toward the prompt.
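
The two steps above can be sketched in a few lines. This is a minimal illustration, not any particular library's API: `NULL_PROMPT`, `maybe_drop_condition`, `predict_both`, and the drop probability of 0.1 are all assumptions made for the example, and `model` stands for any callable that takes a noisy sample, a timestep, and a prompt.

```python
import random

NULL_PROMPT = ""  # hypothetical stand-in for the "no condition" input


def maybe_drop_condition(prompt, p_drop, rng):
    """Return the null prompt with probability p_drop, else the real prompt."""
    return NULL_PROMPT if rng.random() < p_drop else prompt


def predict_both(model, x_t, t, prompt):
    """Run the same network twice: once with the prompt, once without it."""
    eps_cond = model(x_t, t, prompt)
    eps_uncond = model(x_t, t, NULL_PROMPT)
    return eps_cond, eps_uncond


# Training side: roughly a fraction p_drop of examples see the null prompt,
# so one network is trained on both the conditional and unconditional task.
rng = random.Random(0)
batch = ["a red fox"] * 1000
seen = [maybe_drop_condition(p, p_drop=0.1, rng=rng) for p in batch]
drop_rate = seen.count(NULL_PROMPT) / len(seen)
```

In a real system the two forward passes are often batched together for speed, but the idea is the same: one set of weights, two inputs that differ only in the condition.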

The guidance scale

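Form the guided prediction by starting from the unconditional prediction and pushing it in the direction of the conditional one.
A guidance scale controls how hard to push. With the scale multiplying the difference, a scale of one returns the plain conditional prediction, which amounts to no guidance.
Higher scales make the output match the prompt more closely but can reduce diversity and oversaturate images.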
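
As a sketch of the combination rule, assuming the convention guided = uncond + scale * (cond - uncond): the function name `guided_prediction` and the numbers are made up for illustration, and plain Python lists stand in for tensors. Some write-ups use a weight w instead, with the scale equal to 1 + w, so check which convention a given codebase follows.

```python
def guided_prediction(eps_uncond, eps_cond, scale):
    """Push the unconditional prediction toward the conditional one."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


eps_uncond = [0.0, 1.0, -1.0]  # made-up noise predictions for illustration
eps_cond = [1.0, 1.0, 0.0]

no_push = guided_prediction(eps_uncond, eps_cond, 0.0)  # equals eps_uncond
plain = guided_prediction(eps_uncond, eps_cond, 1.0)    # equals eps_cond
strong = guided_prediction(eps_uncond, eps_cond, 4.0)   # extrapolates past eps_cond
```

With a scale above one the result moves beyond the conditional prediction along the same direction, which is why large scales strengthen prompt adherence and can also push samples toward oversaturated, less varied outputs.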

Why it matters

It removes the need to train a separate classifier on noisy inputs and differentiate through it, an approach that was fragile.
It is the main knob behind prompt adherence in modern text-to-image systems.
Tuning the scale trades fidelity to the prompt against sample variety.

Key idea

Classifier-free guidance trains one network to denoise with and without the condition, then amplifies the difference between the two predictions by a guidance scale to steer samples toward the prompt, trading diversity for fidelity.
