The Score Based Models

Generate data by following the gradient of log density, the score, through noise levels.

The Score Based Models

Score based models offer another lens on diffusion. Instead of predicting noise, they learn the score, the gradient of the log probability density, and use it to walk toward high density regions.

What the score is

The score at a point is the direction in which data density increases fastest.
A network called a score network estimates this gradient for noisy data at many noise levels.
Training uses denoising score matching, which turns out to be equivalent to predicting noise in diffusion models.

Sampling by following the score

Start from random noise.
Repeatedly nudge the sample along the estimated score, adding a little randomness each step.
This procedure, Langevin dynamics, drifts samples toward regions where real data is dense.

Why multiple noise levels

At low noise the score is sharp but only accurate near real data.
At high noise the score is smooth and guides samples from anywhere.
Annealing from high to low noise combines broad guidance with fine detail, unifying with the diffusion view.

Key idea

Score based models learn the gradient of log density and sample by following that score with Langevin dynamics across annealed noise levels, a formulation equivalent to denoising diffusion.

The Score Based Models

The Score Based Models

What the score is

Sampling by following the score

Why multiple noise levels

Key idea

Check yourself