The Reparameterization Trick
Training a VAE requires backpropagating through a random sampling step. The reparameterization trick rewrites that sampling so gradients can pass through it.
The problem
- The VAE needs to sample a latent vector from a Gaussian whose mean and variance come from the encoder.
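- Sampling is random, and a raw random draw is not a differentiable function of the mean and variance that produced it. Backpropagation has no path from the loss back to the encoder, so the chain of derivatives breaks.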
The fix
- Move the randomness to an external noise variable drawn from a fixed standard normal.
- Build the latent as mean plus standard deviation times noise: z = mu + sigma * eps, where sigma is the square root of the variance.
- Now the mean and standard deviation enter through deterministic arithmetic, while the only random part is the noise, whose distribution is fixed and carries no parameters.
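The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function name `reparameterize`, the example values, and the choice to have the encoder output a log-variance (a common convention, but not something stated in these notes) are all assumptions.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Return (z, eps) where z = mu + sigma * eps, one value per latent dimension.

    Assumption: the encoder outputs a log-variance, so sigma = exp(0.5 * log_var).
    """
    z, eps = [], []
    for m, lv in zip(mu, log_var):
        sigma = math.exp(0.5 * lv)   # standard deviation from log-variance
        e = rng.gauss(0.0, 1.0)      # external noise from a fixed standard normal
        eps.append(e)
        z.append(m + sigma * e)      # deterministic arithmetic in mu and sigma
    return z, eps


# Hypothetical encoder outputs for a 2-dimensional latent.
rng = random.Random(0)
mu = [0.5, -1.0]
log_var = [0.0, -2.0]
z, eps = reparameterize(mu, log_var, rng)
print("eps:", eps)
print("z:  ", z)
```

All of the randomness lives in `eps`. Given `eps`, the latent `z` is an ordinary function of `mu` and `log_var`.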
Why it works
Gradients flow cleanly through the mean and standard deviation because they appear in a differentiable formula. The random noise is treated as a constant during each backward pass. This single rewrite is what lets the encoder learn by gradient descent.
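To make this concrete, the sketch below holds one noise draw constant and compares hand-derived gradients with finite differences. The toy loss `loss(z) = z * z`, the scalar latent, and all function names are assumptions chosen for illustration. A real VAE would use an autodiff framework and its own loss.

```python
import random


def sample_z(mu, sigma, eps):
    # Reparameterized sample: deterministic in mu and sigma once eps is given.
    return mu + sigma * eps


def loss(z):
    # Hypothetical toy loss standing in for the real training objective.
    return z * z


def analytic_grads(mu, sigma, eps):
    # Chain rule with eps treated as a constant:
    # dz/dmu = 1 and dz/dsigma = eps.
    z = sample_z(mu, sigma, eps)
    dloss_dz = 2.0 * z
    return dloss_dz * 1.0, dloss_dz * eps


def numeric_grads(mu, sigma, eps, h=1e-6):
    # Central finite differences, reusing the same eps for every evaluation.
    d_mu = (loss(sample_z(mu + h, sigma, eps))
            - loss(sample_z(mu - h, sigma, eps))) / (2 * h)
    d_sigma = (loss(sample_z(mu, sigma + h, eps))
               - loss(sample_z(mu, sigma - h, eps))) / (2 * h)
    return d_mu, d_sigma


eps = random.Random(0).gauss(0.0, 1.0)  # one noise draw, held fixed below
mu, sigma = 0.3, 0.8                    # hypothetical encoder outputs
analytic = analytic_grads(mu, sigma, eps)
numeric = numeric_grads(mu, sigma, eps)
print("analytic:", analytic)
print("numeric: ", numeric)
```

The two pairs should agree to within floating-point error, which is the point: once the noise is an input rather than the result of the sampling call, the gradient with respect to the mean and standard deviation is well defined.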
Key idea
The reparameterization trick expresses a latent sample as mean plus standard deviation times external noise drawn from a fixed standard normal, making the sampling step differentiable so gradients reach the encoder.