Machine Learning

The Linear Regression Assumptions

The four assumptions that make ordinary least squares valid and trustworthy.

4 min read · intro

What linear regression promises

Ordinary least squares fits a line by minimizing the sum of squared residuals. Its coefficient estimates are unbiased and efficient only when a few assumptions hold. Knowing them tells you when to trust the model and when to fix it.
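To make "minimizing squared residuals" concrete, here is a minimal sketch for the one-feature case, using the closed-form slope and intercept. The function names `fit_line`, `residuals`, and `sse` and the data are made up for illustration; a real project would more likely use a library.

```python
from statistics import fmean

def fit_line(x, y):
    """Return (intercept, slope) that minimize the sum of squared residuals."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def residuals(x, y, intercept, slope):
    """Observed target minus fitted value, one entry per data point."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def sse(res):
    """Sum of squared residuals, the quantity least squares minimizes."""
    return sum(r ** 2 for r in res)

# Made-up data that roughly follows y = 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(x, y)
res = residuals(x, y, intercept, slope)
print(f"intercept={intercept:.3f} slope={slope:.3f} sse={sse(res):.4f}")
```

With an intercept in the model, the residuals sum to zero by construction, and nudging the slope or intercept away from the fitted values can only increase the sum of squares.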

The core assumptions

  • Linearity: the mean of the target is a linear function of the features.
  • Independence: the residuals are not correlated with each other.
  • Homoscedasticity: the residual variance is constant across all fitted values.
  • Normality: the residuals are roughly normal, which matters mainly for confidence intervals.

Checking the fit
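A residual versus fitted plot is the usual first look, but a couple of numbers can back it up. The sketch below assumes you already have fitted values and residuals as plain lists. The Durbin-Watson statistic is a standard check for lag-1 autocorrelation: values near 2 suggest little autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation. `spread_ratio` is a hypothetical helper written for this lesson, not a library function, and only a crude stand-in for eyeballing fanning variance.

```python
from statistics import fmean

def durbin_watson(res):
    """Durbin-Watson statistic for a list of residuals in observation order."""
    num = sum((res[i] - res[i - 1]) ** 2 for i in range(1, len(res)))
    den = sum(r ** 2 for r in res)
    return num / den

def spread_ratio(fitted, res):
    """Crude fanning check (hypothetical helper).

    Sorts residuals by fitted value, then divides the mean squared residual
    in the upper half by the mean squared residual in the lower half.
    A ratio far from 1 hints at non-constant variance.
    Needs at least two points and a nonzero lower-half spread.
    """
    ordered = [r for _, r in sorted(zip(fitted, res), key=lambda p: p[0])]
    half = len(ordered) // 2
    low = fmean([r ** 2 for r in ordered[:half]])
    high = fmean([r ** 2 for r in ordered[-half:]])
    return high / low

# Made-up residuals that flip sign every step: statistic above 2.
dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])
# Made-up residuals that stay on one side for a while: statistic below 2.
dw_runs = durbin_watson([1.0, 1.0, -1.0, -1.0])

# Made-up residuals whose size grows with the fitted value.
fitted = [1.0, 2.0, 3.0, 4.0]
fanning = [0.1, -0.1, 1.0, -1.0]
ratio = spread_ratio(fitted, fanning)

print(dw_alternating, dw_runs, round(ratio, 1))
```

Neither number replaces the plot or a formal test; they are quick signals for which assumption to examine more closely.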

When assumptions break

  • A curved residual plot suggests adding polynomial or interaction terms.
  • Fanning variance suggests a log transform of the target or weighted least squares.
  • Correlated residuals, common in time series, call for models that account for autocorrelation.

These diagnostics turn a black-box fit into something you can reason about.
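As one example of a fix, weighted least squares for the one-feature case can be sketched as follows. Each squared residual is multiplied by a weight, so noisier points pull on the line less. The function name `fit_weighted_line`, the data, and the choice of weights (1 / x², which assumes the residual standard deviation grows in proportion to x) are all illustrative assumptions.

```python
def fit_weighted_line(x, y, w):
    """Return (intercept, slope) minimizing sum of w_i * residual_i ** 2."""
    w_sum = sum(w)
    # Weighted means replace the plain means used in ordinary least squares.
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / w_sum
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / w_sum
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Made-up data whose scatter grows with x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.8, 6.5, 7.4, 11.0]

# Assumption: residual standard deviation is proportional to x,
# so each point is weighted by 1 / x**2.
weights = [1.0 / xi ** 2 for xi in x]

wls_intercept, wls_slope = fit_weighted_line(x, y, weights)
# Equal weights reduce to ordinary least squares.
ols_intercept, ols_slope = fit_weighted_line(x, y, [1.0] * len(x))
print(f"OLS slope={ols_slope:.3f}  WLS slope={wls_slope:.3f}")
```

In practice the weights come from what you know or estimate about the variance. If the variance structure is unclear, a log transform of the target is often the simpler first attempt.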

Key idea

Least squares is unbiased and efficient under linearity, independence, and constant variance; roughly normal residuals matter mainly for confidence intervals. Residual plots reveal which assumption broke and point to the fix to apply.

Check yourself

1. What does homoscedasticity require?

2. A curved residual versus fitted plot most directly signals a violation of which assumption?