The Cost Function Intuition
Training a model means searching for parameters that make its predictions match reality. The cost function, also called the loss, is the single number that scores how wrong the model currently is.
What it measures
- For regression, mean squared error averages the squared gap between prediction and target.
- For classification, cross entropy punishes confident wrong answers heavily.
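- A cost of zero means a perfect fit on the training set, which is rarely desirable because it usually signals that the model has memorized noise rather than learned a pattern that generalizes.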
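The two costs named above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names, the binary (two-class) form of cross entropy, and the `eps` clamp are assumptions made for the example.

```python
import math

def mean_squared_error(predictions, targets):
    """Average of the squared gaps between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

def binary_cross_entropy(probabilities, labels, eps=1e-12):
    """Average negative log-likelihood for labels in {0, 1}.

    `probabilities` holds the predicted probability of class 1.
    `eps` is an assumed safety clamp that avoids log(0).
    """
    total = 0.0
    for p, y in zip(probabilities, labels):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)

# Regression: gaps of 1 and -2 give (1 + 4) / 2 = 2.5
mse = mean_squared_error([2.0, 4.0], [1.0, 6.0])

# Classification: the true label is 1 in both cases.
confident_right = binary_cross_entropy([0.99], [1])
confident_wrong = binary_cross_entropy([0.01], [1])
```

Comparing `confident_wrong` with `confident_right` shows the asymmetry: the confident wrong answer costs several hundred times more than the confident right one.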
Why a scalar matters
Optimization needs one quantity to push down. By collapsing every example into one averaged score, the cost function turns learning into a well-defined search over a landscape of parameter values. Each point in that landscape has a height equal to the loss at those parameter values, and gradient-based methods walk downhill by repeatedly stepping in the direction that lowers it fastest.
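As a rough sketch of the downhill walk, the following uses a hypothetical one-parameter model `y = w * x` with mean squared error as the cost. The data, the learning rate, and the step count are illustrative assumptions, chosen only so the loop converges.

```python
def cost(w, xs, ys):
    """Mean squared error of the model y = w * x: the landscape height at w."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def cost_gradient(w, xs, ys):
    """Derivative of the cost with respect to w: the local slope."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

# Toy data generated by y = 2x, so the lowest point of the landscape is w = 2.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w = 0.0               # arbitrary starting point
learning_rate = 0.05  # assumed step size
history = [cost(w, xs, ys)]

for _ in range(100):
    # Step against the slope, i.e. downhill.
    w -= learning_rate * cost_gradient(w, xs, ys)
    history.append(cost(w, xs, ys))
```

Because the cost is a single number, each step has an unambiguous notion of "lower". With these settings `w` approaches 2 and the last entry of `history` is close to zero.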
Shaping behavior
The choice of cost encodes what you care about. Squared error chases outliers because squaring makes a few large errors dominate the average. Cross entropy rewards calibrated probabilities. Picking the wrong cost quietly teaches the model the wrong priorities, so the cost function is a design decision, not a formula you copy blindly.
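A small numeric illustration of the outlier effect, using made-up error values: one large error among several small ones takes up a much bigger share of the total once the errors are squared.

```python
# Hypothetical prediction errors; the last one is the outlier.
errors = [1.0, -1.0, 0.5, 10.0]

magnitudes = [abs(e) for e in errors]
squared = [e ** 2 for e in errors]

# Share of the total contributed by the outlier, before and after squaring.
outlier_share_raw = magnitudes[-1] / sum(magnitudes)   # 10 / 12.5
outlier_share_squared = squared[-1] / sum(squared)     # 100 / 102.25
```

Here the outlier accounts for 80% of the raw error magnitude but roughly 98% of the squared error, so an optimizer minimizing squared error spends almost all of its effort on that one point.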
Key idea
The cost function compresses all errors into one scalar that optimization can drive toward a minimum.