From compression to generation
A plain autoencoder learns a code good for reconstruction but with a tangled, gappy structure, so picking a random code rarely decodes into something realistic. A variational autoencoder, or VAE, reshapes the code into a smooth probability space you can sample from to generate new data.
A probabilistic code
Instead of mapping each input to a single point, the encoder outputs a distribution, a mean and a spread, in the latent space:
- A code is sampled from that distribution and decoded
- The training loss has two parts that pull against each other
- A reconstruction term rewards faithful rebuilding
- A regularization term, based on the KL divergence, pulls every input distribution toward a shared standard normal prior
This regularizer packs the codes together without gaps, so sampling a random point from the prior decodes into a plausible new example.
Tradeoffs
VAEs give a smooth, navigable latent space and stable training, great for interpolation and representation learning. Their samples are often blurrier than those from adversarial or diffusion models, a price of the reconstruction objective.
Key idea
A variational autoencoder encodes inputs as distributions regularized toward a normal prior, creating a smooth latent space you can sample to generate new data.