Machine Learning

The Reparameterization Trick

Make sampling differentiable so gradients can flow through a stochastic latent layer.

Training a VAE requires backpropagating through a random sampling step. The reparameterization trick rewrites that sampling so gradients can pass through it.

The problem

  • The VAE needs to sample a latent vector from a Gaussian whose mean and variance come from the encoder.
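  • A raw random draw is not a differentiable function of the mean and variance, so backpropagation has no derivative to pass through it. The chain of derivatives breaks at the sampling step and never reaches the encoder.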

The fix

  • Move the randomness to an external noise variable drawn from a fixed standard normal.
  • Build the latent as mean plus standard deviation times noise: z = μ + σ · ε, where ε is the noise.
  • Now the mean and standard deviation enter through deterministic arithmetic, while the only random part is the fixed noise, which carries no parameters.
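
The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not any particular library's API: the function name `reparameterize` is hypothetical, and having the encoder output a log-variance rather than a variance is a common convention assumed here, not something the lesson specifies.

```python
import math
import random


def reparameterize(mu, log_var, rng=random):
    """Return a latent sample z = mu + sigma * eps, plus the noise used.

    mu and log_var are lists of floats standing in for encoder outputs
    (hypothetical names; the log-variance form is an assumption).
    """
    # The noise comes from a fixed standard normal and has no learnable parameters.
    eps = [rng.gauss(0.0, 1.0) for _ in mu]
    # Recover the standard deviation: sigma = exp(0.5 * log_var).
    sigma = [math.exp(0.5 * lv) for lv in log_var]
    # Deterministic arithmetic: mean plus standard deviation times noise.
    z = [m + s * e for m, s, e in zip(mu, sigma, eps)]
    return z, eps


rng = random.Random(0)  # seeded only so the example is repeatable
z, eps = reparameterize([0.5, -1.0], [0.0, math.log(4.0)], rng)
```

Here the second latent dimension has variance 4, so its standard deviation is 2 and `z[1]` equals `-1.0 + 2.0 * eps[1]`. In a real model an autodiff framework would track `mu` and `log_var` as tensors. The arithmetic is the same.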

Why it works

Gradients flow cleanly through the mean and standard deviation because they appear in a differentiable formula. The random noise is treated as a constant during each backward pass. This single rewrite is what lets the encoder learn by gradient descent.
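
To see that the noise really does behave like a constant, one can compare the analytic derivatives of the latent with numerical estimates. The sketch below assumes a single latent dimension and uses hypothetical helper names (`latent`, `finite_diff`). It illustrates the idea and is not how a training framework computes gradients.

```python
def latent(mu, sigma, eps):
    # Reparameterized sample for a single latent dimension.
    return mu + sigma * eps


def finite_diff(f, x, h=1e-6):
    # Central-difference estimate of df/dx.
    return (f(x + h) - f(x - h)) / (2 * h)


mu, sigma, eps = 0.3, 1.5, -0.7  # eps is one fixed draw, held constant

# Analytic derivatives of z = mu + sigma * eps with eps held fixed.
dz_dmu = 1.0
dz_dsigma = eps

# Numerical estimates agree because z is an ordinary function of mu and sigma.
num_dmu = finite_diff(lambda m: latent(m, sigma, eps), mu)
num_dsigma = finite_diff(lambda s: latent(mu, s, eps), sigma)
```

The derivative with respect to the mean is 1 and the derivative with respect to the standard deviation is the noise value itself, which is why each backward pass can treat the draw as a fixed number.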

Key idea

The reparameterization trick expresses a latent sample as mean plus standard deviation times fixed external noise, making the sampling step differentiable so gradients reach the encoder.

Check yourself

1. Why is a raw sampling step a problem for training?

2. How does the reparameterization trick express the latent?