A distribution over functions
A Gaussian process (GP) is a nonparametric model that defines a distribution over functions. Instead of fixing a parametric form such as a line, it assumes that any finite set of function values is jointly Gaussian, with that distribution determined by a mean function and a covariance function (the kernel).
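To make "jointly Gaussian at any finite set of points" concrete, here is a minimal NumPy sketch that draws a few functions from a GP prior. It assumes a zero mean function and an RBF kernel with unit length scale; the helper name `rbf_kernel` and the small jitter term are illustrative choices, not part of any particular library.

```python
import numpy as np

def rbf_kernel(xa, xb, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    sq_dist = (xa[:, None] - xb[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)

rng = np.random.default_rng(0)

x = np.linspace(-3.0, 3.0, 50)   # a finite set of input locations
mean = np.zeros_like(x)          # assumption: zero mean function

# Covariance of the function values at x. The jitter on the diagonal is an
# assumed numerical safeguard that keeps the matrix positive definite.
cov = rbf_kernel(x, x) + 1e-6 * np.eye(x.size)

# Each row is one function drawn from the prior, evaluated at the 50 inputs.
samples = rng.multivariate_normal(mean, cov, size=3)
```

Plotting each row of `samples` against `x` would show three smooth random curves. Changing `length_scale` changes how quickly they wiggle.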
Prediction with uncertainty
Given training points, the GP conditions its prior on the observed values to produce a posterior. Each prediction comes with both a mean and a variance, so the model says not only what it expects but how confident it is.
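The conditioning step can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions: a zero-mean prior, an RBF kernel with unit variance and length scale, and a small Gaussian noise variance. The names `rbf_kernel` and `gp_posterior` are hypothetical helpers, not a library API.

```python
import numpy as np

def rbf_kernel(xa, xb, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    sq_dist = (xa[:, None] - xb[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_train, y_train, x_test, noise_var=1e-4):
    """Posterior mean and variance of a zero-mean GP at x_test."""
    # Covariance among training points, plus assumed observation noise.
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(x_train.size)
    # Cross-covariance between training and test points.
    k_ts = rbf_kernel(x_train, x_test)
    # Prior variance at each test point.
    k_ss_diag = np.diag(rbf_kernel(x_test, x_test))

    # Solve with a Cholesky factor instead of forming an explicit inverse.
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_ts.T @ alpha

    v = np.linalg.solve(chol, k_ts)
    var = k_ss_diag - np.sum(v**2, axis=0)
    return mean, var

x_train = np.array([-2.0, -1.0, 0.0, 1.5])
y_train = np.sin(x_train)
x_test = np.array([-1.0, 0.5, 6.0])   # at, between, and far from the data

mean, var = gp_posterior(x_train, y_train, x_test)
```

The three test inputs are chosen to contrast a training location, a point between observations, and a point well outside the data. Comparing the entries of `var` shows how confidence changes across them.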
The kernel shapes everything
- The kernel encodes assumptions such as smoothness and length scale (how quickly the function can change).
- A common choice is the RBF (squared-exponential) kernel, which yields very smooth functions.
- Kernel hyperparameters are typically tuned by maximizing the marginal likelihood of the training data.
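As a rough sketch of hyperparameter tuning, the snippet below evaluates the log marginal likelihood of a zero-mean GP for several candidate length scales and keeps the best one. It assumes a fixed noise variance and uses a plain grid search for brevity, whereas practical implementations usually rely on gradient-based optimizers. The data, the candidate grid, and the function names are all made up for illustration.

```python
import numpy as np

def rbf_kernel(xa, xb, length_scale, variance=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    sq_dist = (xa[:, None] - xb[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)

def log_marginal_likelihood(x, y, length_scale, noise_var=1e-2):
    """log p(y | x, hyperparameters) for a zero-mean GP with Gaussian noise."""
    n = x.size
    k = rbf_kernel(x, x, length_scale) + noise_var * np.eye(n)
    chol = np.linalg.cholesky(k)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y))

    data_fit = -0.5 * y @ alpha                   # how well the kernel explains y
    complexity = -np.sum(np.log(np.diag(chol)))   # equals -0.5 * log det(K)
    constant = -0.5 * n * np.log(2.0 * np.pi)
    return data_fit + complexity + constant

# Synthetic data, assumed purely for illustration.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 5.0, 25)
y = np.sin(2.0 * x) + 0.1 * rng.standard_normal(x.size)

# Grid search over an assumed set of candidate length scales.
candidates = np.array([0.05, 0.2, 0.5, 1.0, 3.0])
scores = np.array([log_marginal_likelihood(x, y, ls) for ls in candidates])
best_length_scale = candidates[np.argmax(scores)]
```

The two non-constant terms pull in opposite directions: the data-fit term rewards kernels that explain the observations, while the log-determinant term penalizes overly flexible ones.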
Strengths and limits
- GPs shine with small data and where calibrated uncertainty matters, such as Bayesian optimization.
- Uncertainty grows naturally far from observed points.
- The catch is cost: exact inference scales with the cube of the number of training points in time (and with the square in memory), so plain GPs do not handle large datasets without approximations.
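A back-of-the-envelope calculation shows why the cubic cost bites. The sketch below assumes a dense kernel matrix stored as 8-byte floats and uses the leading-order flop count of a Cholesky factorization (about n^3 / 3) as a stand-in for training cost. The function name `exact_gp_cost` is hypothetical, and real runtimes depend on hardware and implementation.

```python
def exact_gp_cost(n):
    """Rough cost of exact GP training on n points (assumed dense float64)."""
    cholesky_flops = n**3 / 3   # leading-order flop count of a Cholesky factorization
    kernel_bytes = 8 * n**2     # n x n kernel matrix of 8-byte floats
    return cholesky_flops, kernel_bytes

for n in (1_000, 10_000, 100_000):
    flops, mem = exact_gp_cost(n)
    print(f"n={n:>7,}: ~{flops:.1e} flops, ~{mem / 1e9:.2f} GB for the kernel matrix")
```

Under these assumptions, ten times more data means a thousand times more arithmetic and a hundred times more memory. At 100,000 points the kernel matrix alone needs about 80 GB.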
Key idea
A Gaussian process is a kernel-defined distribution over functions that gives predictions with calibrated uncertainty. It excels on small data, but exact inference scales cubically with the number of points, so large datasets require approximations.