Autoencoder
Motivation
An autoencoder is a neural network trained to reconstruct its input. It consists of an encoder \(f_\phi : \mathcal{X} \to \mathcal{Z}\) that maps an input \(x\) to a low-dimensional representation \(z\), and a decoder \(g_\theta : \mathcal{Z} \to \mathcal{X}\) that maps \(z\) back to \(\hat x\). The training objective minimizes a reconstruction loss like \(\|x - g_\theta(f_\phi(x))\|^2\) (Hinton and Salakhutdinov 2006; Goodfellow et al. 2016).
The point is not the reconstruction itself (the identity function reconstructs perfectly) but the bottleneck: by making \(\dim(\mathcal{Z}) < \dim(\mathcal{X})\), the network is forced to discover a compressed representation that captures the most informative axes of variation in the data. Autoencoders are the conceptual ancestor of variational autoencoders, for which much of this article serves as setup.
Architecture
A symmetric encoder-decoder pair:
\[ z = f_\phi(x), \qquad \hat x = g_\theta(z). \]
Both \(f_\phi\) and \(g_\theta\) are typically MLPs (for tabular data) or CNN/transposed-CNN pairs (for images). The latent dimension \(\dim(z)\) is much smaller than \(\dim(x)\) — for MNIST images (\(28^2 = 784\) pixels), latents of dimension \(2\) to \(32\) are typical.
The training objective is reconstruction loss, e.g.,
\[ L(\phi, \theta) = \mathbb{E}_{x \sim p_{\text{data}}}\!\left[\|x - g_\theta(f_\phi(x))\|^2\right]. \]
Common choices are mean squared error for continuous data and cross-entropy for binary data; for images, the loss is typically applied pixelwise.
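As a minimal sketch of the forward pass and objective (numpy, with random untrained weights; the dimensions, the single-layer encoder/decoder, and the tanh activation are illustrative assumptions, not a prescribed architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions: 784-dimensional inputs (e.g. flattened MNIST),
# 32-dimensional latent.
d_in, d_z = 784, 32

# One linear layer each for encoder and decoder, tanh in the encoder.
# In practice these weights would be trained by gradient descent.
W_e = rng.normal(scale=0.05, size=(d_z, d_in))
W_d = rng.normal(scale=0.05, size=(d_in, d_z))

def encode(x):
    return np.tanh(W_e @ x)   # z = f_phi(x)

def decode(z):
    return W_d @ z            # x_hat = g_theta(z)

x = rng.normal(size=d_in)
x_hat = decode(encode(x))
loss = np.mean((x - x_hat) ** 2)  # per-example reconstruction MSE
```

Training amounts to minimizing this loss over a dataset with respect to both weight matrices.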
Diagram: encoder–bottleneck–decoder hourglass
The encoder narrows the input down to a low-dimensional latent \(z\); the decoder mirrors the encoder, expanding \(z\) back to \(\hat x\). The bottleneck — the narrow waist — is what forces the network to learn a compressed representation.
Connection to PCA
A linear autoencoder with a squared-error loss recovers principal component analysis. Concretely, if \(f_\phi(x) = W_e x\) and \(g_\theta(z) = W_d z\) with \(\dim(z) = k\), the optimal \(W_e\), \(W_d\) project onto the top \(k\) principal components of the data covariance.
This is the linear case. Nonlinear autoencoders generalize PCA in the same way that neural networks generalize linear regression: they can find lower-dimensional representations of curved manifolds in \(\mathcal{X}\) that PCA cannot.
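The linear equivalence can be checked numerically. The sketch below (synthetic data and dimensions are illustrative assumptions) builds the optimal linear autoencoder from the top \(k\) principal components and verifies that its reconstruction matches the best rank-\(k\) approximation of the centered data matrix (the truncated SVD, by the Eckart-Young theorem):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: 200 points in 10-D whose variance lives mostly in 3 directions.
X = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 10)) \
    + 0.01 * rng.normal(size=(200, 10))
X = X - X.mean(axis=0)            # PCA assumes centered data

k = 3
# PCA via SVD: rows of Vt are the principal directions.
U, S, Vt = np.linalg.svd(X, full_matrices=False)
P = Vt[:k]                        # top-k principal components

# Linear "autoencoder": encode with W_e = P, decode with W_d = P.T.
Z = X @ P.T                       # latent codes
X_hat = Z @ P                     # reconstruction

# Best possible rank-k reconstruction: the truncated SVD of X.
X_svd = (U[:, :k] * S[:k]) @ Vt[:k]
# X_hat and X_svd coincide (up to floating-point error).
```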
Variants
Denoising autoencoders (Vincent et al. 2008) corrupt the input (by adding Gaussian noise, masking pixels, or applying other transformations) and train the network to reconstruct the clean input. The objective becomes
\[ L = \mathbb{E}_{x, \tilde x}\!\left[\|x - g_\theta(f_\phi(\tilde x))\|^2\right], \]
where \(\tilde x\) is the corrupted input. The network must learn to recover the underlying signal rather than memorize its inputs. This is a foundational technique with surprising depth: denoising training is the basis of modern diffusion models and is connected to score matching via the Vincent identity.
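A sketch of the corruption step and denoising objective (numpy; the Gaussian-noise-plus-masking corruption, its parameters, and the random stand-in weights are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

def corrupt(x, noise_std=0.5, mask_prob=0.25):
    """Two common corruptions: additive Gaussian noise, then random masking."""
    x_tilde = x + noise_std * rng.normal(size=x.shape)
    mask = rng.random(x.shape) > mask_prob   # zero out ~25% of entries
    return x_tilde * mask

x = rng.normal(size=784)
x_tilde = corrupt(x)

# Stand-in network (random untrained weights).
W_e = rng.normal(scale=0.05, size=(32, 784))
W_d = rng.normal(scale=0.05, size=(784, 32))
x_hat = W_d @ np.tanh(W_e @ x_tilde)

# The target is the *clean* x, not the corrupted x_tilde.
loss = np.mean((x - x_hat) ** 2)
```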
Sparse autoencoders add an \(\ell_1\) penalty on the latent activations to encourage most of them to be zero. The bottleneck is implicit (most units inactive) rather than explicit (small dimension).
Contractive autoencoders add a penalty on the Frobenius norm of the encoder Jacobian, encouraging \(f_\phi\) to be insensitive to small input perturbations.
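Both penalties are easy to write down. The sketch below (numpy; the tanh encoder with random weights is an illustrative assumption) computes each, using the closed-form Jacobian \(\partial z / \partial x = \mathrm{diag}(1 - z^2)\, W_e\) that holds for a tanh encoder:

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative tanh encoder with random (untrained) weights.
W_e = rng.normal(scale=0.05, size=(32, 784))
x = rng.normal(size=784)
z = np.tanh(W_e @ x)

# Sparse autoencoder: l1 penalty on the latent activations,
# added to the reconstruction loss with some weight lambda.
sparse_penalty = np.sum(np.abs(z))

# Contractive autoencoder: squared Frobenius norm of the encoder
# Jacobian. For z = tanh(W_e x), the Jacobian is diag(1 - z^2) W_e.
J = (1.0 - z ** 2)[:, None] * W_e
contractive_penalty = np.sum(J ** 2)
```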
Variational autoencoders make the latent variable probabilistic and connect to a generative model. See variational autoencoder. This is the most consequential variant.
What Autoencoders Cannot Do
A trained autoencoder is a deterministic encoder-decoder pair, which leaves it with two limitations as a generative model:
- No probabilistic interpretation. The latent space has no defined density; sampling \(z\) from a prior and decoding gives outputs that are not principled samples from any distribution.
- No coverage guarantee. The latent codes assigned to training data may occupy a complicated subset of \(\mathcal{Z}\). Drawing \(z\) from a Gaussian prior may land outside this region, where the decoder produces garbage.
Both problems are solved by the VAE, which augments the autoencoder with a probabilistic encoder and a KL regularizer that pulls the encoder distribution toward a prior on \(\mathcal{Z}\).
When to Use a Plain Autoencoder
For pure dimensionality reduction or denoising, plain autoencoders are useful. They are simpler than VAEs and often produce sharper outputs, because they do not inject noise at the latent. They are not generative models in the probabilistic sense, but they do learn a useful representation.
For generation, use a VAE, GAN, normalizing flow, or diffusion model. The plain autoencoder is the conceptual ancestor of all of these but is not by itself a generative model.