Machine Learning

R Squared and Adjusted R Squared

How much variance your model explains, and why raw R squared rewards clutter.

A proportion of variance

R squared, the coefficient of determination, is the fraction of the variance in the target that the model explains, compared to just predicting the mean.

  • 1.0 means perfect prediction
  • 0 means no better than the mean
  • Negative is possible on test data when the model is worse than the mean
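The three cases above can be checked by hand. Here is a minimal sketch, assuming the usual definition R squared = 1 - SS_res / SS_tot; the function name `r_squared` and the toy numbers are made up for illustration.

```python
def r_squared(y_true, y_pred):
    """R squared = 1 - SS_res / SS_tot, with the mean of y_true as the baseline."""
    mean_y = sum(y_true) / len(y_true)
    # Squared error of the model's predictions
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # Squared error of always predicting the mean
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


y = [1.0, 2.0, 3.0, 4.0]  # toy targets, mean is 2.5

perfect = r_squared(y, y)                    # predictions match exactly
mean_only = r_squared(y, [2.5] * 4)          # always predict the mean
worse = r_squared(y, [4.0, 3.0, 2.0, 1.0])   # predictions run the wrong way

print(perfect, mean_only, worse)  # 1.0 0.0 -3.0
```

The last case shows why R squared is not really a "square": once the model's error exceeds the error of the mean baseline, the value drops below zero.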

The inflation problem

For a least-squares model scored on its own training data, adding any feature, even pure noise, can only keep R squared the same or raise it. So a bigger model never looks worse on training R squared, which tempts overfitting.
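To see the inflation in action, here is a small sketch that fits ordinary least squares twice, once with a real feature and once with an extra column of random noise. The helper `ols_r2` and the synthetic data are assumptions for this example. It uses only the standard library to stay self-contained; in practice you would likely reach for NumPy or scikit-learn.

```python
import random


def ols_r2(columns, y):
    """Training R squared of a least-squares fit with an intercept.

    columns: list of feature columns, each a list of floats.
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Normal equations (X^T X) b = X^T y, stored as an augmented matrix
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    pred = [sum(b * x for b, x in zip(beta, row)) for row in X]
    mean_y = sum(y) / n
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


random.seed(0)
n = 50
x = [random.gauss(0, 1) for _ in range(n)]       # a genuinely useful feature
y = [2.0 * xi + random.gauss(0, 1) for xi in x]  # target depends on x only
noise = [random.gauss(0, 1) for _ in range(n)]   # unrelated to y

r2_base = ols_r2([x], y)
r2_with_noise = ols_r2([x, noise], y)

print(f"x only:    {r2_base:.4f}")
print(f"x + noise: {r2_with_noise:.4f}")  # never lower than the line above
```

The noise column has nothing to do with `y`, yet least squares can always set its coefficient to zero, so the fit on the training data cannot get worse.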

The adjustment

Adjusted R squared penalizes extra predictors by charging for the degree of freedom each one uses up. It rises only if a new feature improves the fit more than chance alone would, and it can fall when you add a useless variable.
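The lesson does not spell out the formula, so for reference here is the usual textbook form, assuming n observations and p predictors (not counting the intercept):

```latex
\bar{R}^2 = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1}
```

Each added predictor raises p and so inflates the penalty factor (n - 1) / (n - p - 1). The adjusted value only goes up if R squared rises by enough to outweigh that.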

  • Use R squared to describe the variance explained by a single, fixed model
  • Use adjusted R squared to compare models with different numbers of features
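A quick sketch of why adjusted R squared is the fairer comparison, assuming the standard formula 1 - (1 - R²)(n - 1)/(n - p - 1). The function name and the R squared values below are invented for illustration, not taken from a real fit.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R squared for n observations and p predictors (intercept excluded)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


n = 30  # hypothetical sample size

# Hypothetical case: a 4th feature nudges plain R squared from 0.800 to 0.801
adj_before = adjusted_r_squared(0.800, n, p=3)
adj_after = adjusted_r_squared(0.801, n, p=4)

print(f"3 features: R2=0.800, adjusted={adj_before:.4f}")
print(f"4 features: R2=0.801, adjusted={adj_after:.4f}")
```

Plain R squared went up, but the adjusted value goes down (roughly 0.777 to 0.769), which flags the fourth feature as not worth its cost.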

A caution

A high R squared does not prove a good model. It says nothing about systematic patterns in the residuals, causation, or generalization to new data. Always pair it with a residual plot.
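As a small illustration of this caution, the sketch below fits a straight line to data that is actually curved (y = x squared, a made-up example). R squared comes out high, yet the residuals show an obvious pattern. The residuals are printed as text here to avoid extra dependencies; a plotting library would normally draw them.

```python
xs = [float(i) for i in range(11)]  # 0, 1, ..., 10
ys = [x ** 2 for x in xs]           # truly quadratic target

# Simple linear regression via the closed-form slope and intercept
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
ss_res = sum(r ** 2 for r in residuals)
ss_tot = sum((y - mean_y) ** 2 for y in ys)
r2 = 1.0 - ss_res / ss_tot

print(f"R squared: {r2:.3f}")  # about 0.928
for x, r in zip(xs, residuals):
    print(f"x={x:4.0f}  residual={r:7.2f}")
```

The residuals are positive at both ends and negative in the middle, a U shape that tells you the line is missing curvature even though R squared is above 0.9.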

Key idea

R squared measures variance explained, but on training data it never decreases when you add features. Adjusted R squared corrects for feature count and is the fairer comparison between models of different sizes.

Check yourself

1. What happens to plain R squared when you add a useless noise feature?

2. When should you prefer adjusted R squared?