The Variational Autoencoder
A variational autoencoder, or VAE, addresses a weakness of the plain autoencoder, whose latent space can contain gaps that decode to nothing meaningful. It does so by regularizing the latent space toward a known distribution, which makes sampling new data meaningful.
Encoding to a distribution
- Instead of one latent point, the encoder outputs a mean and a variance for each latent dimension.
- A latent vector is drawn from that Gaussian, then decoded into a reconstruction.
- Because the decoder must reconstruct the input from any sample drawn near the mean, nearby latents are trained to decode to similar outputs, which smooths the space.
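The sampling step above can be sketched in a few lines. This is a minimal illustration, not a full encoder: the `mean` and `log_var` values are hypothetical stand-ins for encoder outputs, and it assumes the encoder predicts the log of the variance rather than the variance itself, which is a common parameterization. The draw is written as mean plus scaled noise (often called the reparameterization trick), which is one usual way to implement it.

```python
import math
import random

def sample_latent(mean, log_var, rng):
    """Draw z = mean + std * eps with eps ~ N(0, 1), one value per latent dimension."""
    z = []
    for m, lv in zip(mean, log_var):
        std = math.exp(0.5 * lv)   # convert log-variance to standard deviation
        eps = rng.gauss(0.0, 1.0)  # noise that does not depend on the encoder outputs
        z.append(m + std * eps)
    return z

rng = random.Random(0)
mean = [0.5, -1.0]     # hypothetical encoder output: mean per latent dimension
log_var = [-2.0, 0.0]  # hypothetical encoder output: log-variance per latent dimension
z = sample_latent(mean, log_var, rng)
```

Writing the draw this way keeps the randomness in `eps`, so in a real training setup gradients could flow through `mean` and `std`.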
The two loss terms
- Reconstruction loss pushes the output to match the input.
- A Kullback-Leibler (KL) term pulls each encoded distribution toward a standard normal prior.
- Together these form the negative of the evidence lower bound, or ELBO, so minimizing the combined loss is equivalent to maximizing the ELBO.
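As a rough sketch of how the two terms combine, the code below uses the closed-form KL divergence between a diagonal Gaussian and a standard normal, 0.5 * sum(mean^2 + var - log(var) - 1). The squared-error reconstruction term, the log-variance parameterization, and all function names here are assumptions chosen for illustration; other reconstruction losses (for example cross-entropy) are equally common.

```python
import math

def kl_to_standard_normal(mean, log_var):
    """KL( N(mean, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mean, log_var))

def squared_error(x, x_hat):
    """Reconstruction loss: sum of squared differences (an assumed choice)."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))

def vae_loss(x, x_hat, mean, log_var):
    """Negative ELBO up to constants: reconstruction term plus KL term."""
    return squared_error(x, x_hat) + kl_to_standard_normal(mean, log_var)
```

The KL term is zero exactly when the encoded distribution already equals the standard normal, and it grows as the mean moves away from zero or the variance moves away from one.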
Why it generates
Because the latent space is regularized toward a standard normal, you can draw a random vector from that normal at generation time, decode it, and get a plausible new example. The KL term is what makes random sampling work: without it, random vectors could land in regions the decoder never saw during training.
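The generation procedure can be sketched as follows. The `toy_decoder` here is a hypothetical stand-in (a fixed linear map with made-up weights), not a trained network; only the sampling pattern is the point.

```python
import random

def toy_decoder(z):
    """Hypothetical stand-in for a trained decoder: a fixed linear map."""
    weights = [[1.0, 0.0], [0.5, -0.5], [0.0, 2.0]]  # 3 outputs from 2 latents
    return [sum(w * zi for w, zi in zip(row, z)) for row in weights]

def generate(decoder, latent_dim, rng):
    """Sample z from the standard normal prior and decode it."""
    z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    return decoder(z)

rng = random.Random(0)
sample = generate(toy_decoder, latent_dim=2, rng=rng)
```

Note that the encoder is not involved at all in this step: generation needs only the prior and the decoder.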
Key idea
A VAE encodes each input to a distribution and adds a KL term that pulls the latent space toward a standard normal, so random samples decode into plausible new data.