Scoring continuous errors
Regression predicts numbers, so each error is a residual: the gap between the prediction and the true value. How you summarize those residuals can change which model wins.
The four metrics
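- MAE is the mean absolute error, the average size of the residuals; it stays in the original units and is less sensitive to outliers than the squared-error metrics
- MSE is the mean squared error; squaring weights large errors disproportionately, so it is sensitive to outliers, and its units are the square of the original units
- RMSE is the square root of MSE, back in the original units but still sensitive to outliers
- MAPE is the mean absolute percentage error, the average of each absolute residual divided by the absolute actual value; it is a relative measure, so it can be compared across different scales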
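The four definitions above can be sketched in plain Python. This is a minimal illustration, not a library implementation: the function names and the sample numbers are made up for the example, and MAPE is reported as a percentage.

```python
import math

def mae(actual, predicted):
    """Mean absolute error: average size of the residuals."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mse(actual, predicted):
    """Mean squared error: average of the squared residuals."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error: square root of MSE, in original units."""
    return math.sqrt(mse(actual, predicted))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Fails if any actual is 0."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical data: residuals are -10, 10, 0 and 40.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 300.0, 360.0]

print(mae(actual, predicted))   # 15.0
print(mse(actual, predicted))   # 450.0
print(rmse(actual, predicted))  # about 21.21
print(mape(actual, predicted))  # about 6.25
```

The single miss of 40 accounts for 1600 of the 1800 total squared error, which is why RMSE lands well above MAE on the same residuals.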
Choosing
- Use MAE when all errors matter equally and outliers are noise
- Use RMSE when large errors are disproportionately costly
- Use MAPE to compare across series of different magnitudes, but beware that it is undefined when an actual value is zero and blows up when actuals are close to zero
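A small sketch of how these choices play out, using invented numbers. Two hypothetical models are scored on the same targets: `steady` is always off by 2, while `spiky` is perfect except for one miss of 7. The last part shows MAPE on an actual value close to zero.

```python
import math

def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    # Percentage form; raises ZeroDivisionError if any actual is exactly 0.
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

actual = [10.0] * 5
steady = [12.0, 8.0, 12.0, 8.0, 12.0]   # always off by 2
spiky = [10.0, 10.0, 10.0, 10.0, 17.0]  # perfect except one miss of 7

print(mae(actual, steady), rmse(actual, steady))  # 2.0 2.0
print(mae(actual, spiky), rmse(actual, spiky))    # 1.4 and about 3.13

# Both predictions below miss by 0.5, but the first actual is near zero.
small_actual = [0.01, 100.0]
small_pred = [0.51, 100.5]
print(mape(small_actual, small_pred))  # about 2500 (percent)
```

MAE prefers `spiky` and RMSE prefers `steady`, so the ranking depends on whether one large miss costs more than several small ones. In the MAPE case, the same absolute miss of 0.5 counts as 5000% on the small actual and 0.5% on the large one.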
A subtle point
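MAE is minimized by predicting the median of the target (the conditional median, given the inputs), while MSE is minimized by predicting the mean. So the metric you optimize quietly changes the target your model aims for, and on skewed data those two targets can be far apart.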
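This can be checked numerically in the simplest setting, a model that predicts one constant for every row. The sketch below uses a made-up skewed sample and a brute-force search over integer candidates; it illustrates the claim and is not a proof.

```python
import statistics

# Hypothetical skewed sample: one large value pulls the mean upward.
values = [1, 2, 3, 4, 100]

def mae_const(c):
    """MAE when every prediction is the constant c."""
    return sum(abs(v - c) for v in values) / len(values)

def mse_const(c):
    """MSE when every prediction is the constant c."""
    return sum((v - c) ** 2 for v in values) / len(values)

candidates = range(0, 101)
best_for_mae = min(candidates, key=mae_const)
best_for_mse = min(candidates, key=mse_const)

print(best_for_mae, statistics.median(values))  # 3 3
print(best_for_mse, statistics.mean(values))    # 22 22
```

The constant that minimizes MAE lands on the median (3) and the one that minimizes MSE lands on the mean (22), even though both are scored against the same five values.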
Key idea
MAE is robust and median seeking, RMSE punishes big misses and seeks the mean, and MAPE is scale free but unstable near zero. Match the metric to the cost of errors.