What linear regression promises
Ordinary least squares fits a line by minimizing the sum of squared residuals. Its estimates are unbiased and efficient only when a few assumptions hold. Knowing them tells you when to trust the model and when to fix it.
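To make "minimizing squared residuals" concrete, here is a minimal sketch that fits a line with NumPy. The helper name `fit_ols` and the synthetic data (true intercept 2, true slope 3) are assumptions made up for illustration.

```python
import numpy as np


def fit_ols(x, y):
    """Return coefficients (intercept first) that minimize the sum of squared residuals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim == 1:
        x = x.reshape(-1, 1)
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(y)), x])
    # lstsq solves min ||design @ beta - y||^2, the least-squares objective.
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


# Hypothetical data: a straight line plus constant-variance Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
y = 2.0 + 3.0 * x + rng.normal(scale=1.0, size=x.size)

beta = fit_ols(x, y)
print("intercept, slope:", beta)
```

Because this data satisfies every assumption, the recovered coefficients should land close to the true values.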
The core assumptions
- Linearity: the mean of the target is a linear function of the features.
- Independence: the residuals are not correlated with each other.
- Homoscedasticity: the residual variance is constant across all fitted values.
- Normality: the residuals are roughly normal, which matters mainly for confidence intervals.
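Written as a model, the four assumptions above can be stated compactly. The symbols here ($y_i$ for the target, $x_i$ for the features, $\beta$ for the coefficients, $\varepsilon_i$ for the error term, $\sigma^2$ for its variance) are notation introduced for illustration. The residuals from a fit are the observable stand-ins for $\varepsilon_i$.

```latex
\begin{align*}
y_i &= \beta_0 + x_i^\top \beta + \varepsilon_i, \quad \mathbb{E}[\varepsilon_i \mid x_i] = 0 && \text{(linearity)} \\
\operatorname{Cov}(\varepsilon_i, \varepsilon_j) &= 0 \quad \text{for } i \neq j && \text{(independence)} \\
\operatorname{Var}(\varepsilon_i \mid x_i) &= \sigma^2 && \text{(homoscedasticity)} \\
\varepsilon_i &\sim \mathcal{N}(0, \sigma^2) && \text{(normality)}
\end{align*}
```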
Checking the fit
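One way to check the fit is to examine the residuals against the fitted values. The sketch below is a rough numeric stand-in for a residual plot, not a formal test. The function `residual_diagnostics`, its summary numbers, and the synthetic data are assumptions for illustration. The last number it returns is the Durbin-Watson statistic, which sits near 2 when consecutive residuals are uncorrelated.

```python
import numpy as np


def residual_diagnostics(x, y):
    """Fit a straight line and summarize what a residual plot would show.

    Assumes the observations are in their natural (e.g. time) order, which
    the Durbin-Watson statistic needs in order to be meaningful.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ beta
    resid = y - fitted

    # Curvature: do residuals track the squared distance from the mean fit?
    centered = fitted - fitted.mean()
    curvature = np.corrcoef(resid, centered**2)[0, 1]
    # Fanning: does the residual magnitude grow (or shrink) with the fit?
    fanning = np.corrcoef(np.abs(resid), fitted)[0, 1]
    # Durbin-Watson: about 2 means no first-order autocorrelation.
    durbin_watson = np.sum(np.diff(resid) ** 2) / np.sum(resid**2)

    return {
        "curvature": curvature,
        "fanning": fanning,
        "durbin_watson": durbin_watson,
    }


# Hypothetical data: one well-behaved line, one with an ignored squared term.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 300)
ok = residual_diagnostics(x, 1.0 + 2.0 * x + rng.normal(size=x.size))
curved = residual_diagnostics(x, 1.0 + 2.0 * x + 0.5 * x**2 + rng.normal(size=x.size))

print("well-behaved:", ok)
print("curved:", curved)
```

In practice you would also plot the residuals against the fitted values. The summary numbers only flag the patterns that are easiest to quantify.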
When assumptions break
- A curved residual plot suggests adding polynomial or interaction terms.
- Fanning variance suggests a log transform of the target or weighted least squares.
- Correlated residuals, common in time series, call for models that account for autocorrelation.
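Two of the fixes above can be sketched with NumPy alone: adding a squared term for curvature, and weighted least squares for fanning variance. The helper names `ols` and `wls`, the synthetic data, and the weights `1 / x**2` are assumptions for illustration. The weights presume the noise standard deviation grows in proportion to `x`, which real data would need to justify.

```python
import numpy as np


def ols(design, y):
    """Ordinary least squares on an explicit design matrix."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


def wls(design, y, weights):
    """Weighted least squares: minimize sum(w_i * r_i^2).

    Scaling each row by sqrt(w_i) turns the weighted problem into an
    ordinary one, so the same solver can be reused.
    """
    root_w = np.sqrt(np.asarray(weights, dtype=float))
    beta, *_ = np.linalg.lstsq(design * root_w[:, None], y * root_w, rcond=None)
    return beta


rng = np.random.default_rng(1)
x = np.linspace(1.0, 10.0, 300)
linear_design = np.column_stack([np.ones_like(x), x])

# Fix 1: a curved residual pattern, addressed by adding a squared term.
y_curved = 1.0 + 2.0 * x + 0.5 * x**2 + rng.normal(scale=1.0, size=x.size)
quad_design = np.column_stack([np.ones_like(x), x, x**2])
sse_linear = np.sum((y_curved - linear_design @ ols(linear_design, y_curved)) ** 2)
sse_quad = np.sum((y_curved - quad_design @ ols(quad_design, y_curved)) ** 2)

# Fix 2: fanning variance (noise sd proportional to x), addressed by
# down-weighting the noisier observations with weights 1 / x^2.
y_fan = 1.0 + 2.0 * x + rng.normal(scale=0.5 * x)
beta_wls = wls(linear_design, y_fan, weights=1.0 / x**2)

print("SSE linear vs quadratic:", sse_linear, sse_quad)
print("WLS intercept, slope:", beta_wls)
```

The squared term should cut the residual sum of squares sharply, and the weighted fit should recover a slope close to the true value of 2 despite the uneven noise.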
These diagnostics turn a black-box fit into something you can reason about.
Key idea
Least squares is unbiased and efficient among linear estimators under linearity, independence, and constant variance; normal residuals matter mainly for confidence intervals. Residual plots reveal which assumption broke and which fix to apply.