The Variational Autoencoder

Turn an autoencoder into a true generator by learning a smooth probabilistic latent space.

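A variational autoencoder, or VAE, addresses the main weakness of a plain autoencoder: gaps in the latent space, regions that no training input maps to and that decode to nothing meaningful. It does so by regularizing the latent space toward a known distribution, so that decoding a sample from that distribution yields meaningful new data.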

Encoding to a distribution

Instead of one latent point, the encoder outputs a mean and a variance for each latent dimension, which together define a Gaussian over the latent space.
A latent vector is drawn from that Gaussian, then decoded into a reconstruction.
Because the decoder must reconstruct the input from any sample near the mean, nearby latents decode to similar outputs, which smooths the space.
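
The sampling step can be sketched in a few lines of NumPy. This is a minimal illustration, not a full encoder: the values of mu and log_var are made up and stand in for encoder outputs, and the sketch assumes the encoder predicts the log of the variance, a common convention for numerical stability. Writing the draw as mean plus scaled noise is the usual reparameterization trick, which the text above does not name but which is how this step is typically made trainable.

```python
import numpy as np

def sample_latent(mu, log_var, rng):
    """Draw z from N(mu, diag(exp(log_var))) as mu + sigma * eps."""
    sigma = np.exp(0.5 * log_var)         # standard deviation per dimension
    eps = rng.standard_normal(mu.shape)   # noise from a standard normal
    return mu + sigma * eps

rng = np.random.default_rng(0)

# Hypothetical encoder outputs for one input with two latent dimensions.
mu = np.array([0.5, -1.0])
log_var = np.array([-2.0, 0.0])

z = sample_latent(mu, log_var, rng)       # a different z on every call
```

As the variance shrinks toward zero, z collapses onto mu and the model behaves like a plain autoencoder.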

The two loss terms

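A reconstruction loss pushes the output to match the input.
A Kullback-Leibler (KL) divergence term pulls each encoded distribution toward a standard normal prior.
Together these form the evidence lower bound, or ELBO: the expected reconstruction log-likelihood minus the KL term. The VAE maximizes the ELBO, which is the same as minimizing the reconstruction loss plus the KL term.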
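
As a rough sketch, the two terms can be computed as follows. The KL divergence between a diagonal Gaussian and a standard normal has a closed form, used here. The squared-error reconstruction term is an assumption (it corresponds to a Gaussian decoder, up to a constant and scale); binary data would usually call for cross-entropy instead. The function names and the log-variance parameterization are illustrative choices.

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dims."""
    return 0.5 * np.sum(mu**2 + np.exp(log_var) - 1.0 - log_var, axis=-1)

def vae_loss(x, x_hat, mu, log_var):
    """Negative ELBO up to a constant: reconstruction error plus KL term."""
    recon = np.sum((x - x_hat) ** 2, axis=-1)   # assumed squared-error reconstruction
    kl = kl_to_standard_normal(mu, log_var)
    return recon + kl
```

The KL term is zero only when the encoded distribution is exactly the standard normal, and it grows as the mean moves away from zero or the variance moves away from one.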

Why it generates

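Because the latent space is regularized toward a standard normal, you can sample a random vector from that normal at generation time, decode it, and get a plausible new example. The KL term is what makes random sampling work: without it, the encodings could sit anywhere in the latent space, and a random sample would likely land in a region the decoder never saw during training.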
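
Generation then needs only the decoder. The sketch below shows the shape of that procedure under loud assumptions: the decode function is a hypothetical stand-in with fixed random weights, not a trained network, so its outputs are meaningless and only the sample-then-decode flow is the point.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, data_dim = 2, 4

# Hypothetical stand-in for a trained decoder: fixed random weights, not learned.
W = rng.standard_normal((latent_dim, data_dim))
b = np.zeros(data_dim)

def decode(z):
    """Map latent vectors to data space; the sigmoid keeps outputs in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-(z @ W + b)))

def generate(n, rng):
    """Sample n latent vectors from the standard normal prior and decode them."""
    z = rng.standard_normal((n, latent_dim))
    return decode(z)

samples = generate(3, rng)   # three new examples, one per row
```

Note that the encoder plays no part here: new examples come from the prior alone.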

Key idea

A VAE encodes each input to a distribution and adds a KL term that pulls the latent space toward a standard normal, so random samples decode into plausible new data.
