Machine Learning

R Squared For Regression

How much variance your model explains, and why it can go negative.

A relative score for regression

Raw error depends on the scale of the target, so it is hard to judge in isolation. R squared, the coefficient of determination, gives a scale-free score by comparing your model to a trivial baseline that always predicts the mean of the target.

The comparison

R squared equals one minus the ratio of your model's squared error to the squared error of the mean baseline, with both errors summed over the same data.
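
Written out as a formula, the comparison looks roughly like this. The symbols are assumptions introduced here for illustration: y_i is an observed target, \hat{y}_i is the model's prediction for it, and \bar{y} is the mean of the observed targets.

```latex
R^2 \;=\; 1 - \frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2}
```

The numerator is the model's squared error and the denominator is the squared error of the mean baseline, so the ratio is zero for a perfect model and one for a model that only matches the baseline.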

  • An R squared of one means the model predicts every target exactly, so it explains all the variance.
  • An R squared of zero means it does no better than always guessing the mean.
  • A negative R squared means the model is worse than the mean baseline.
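
The three cases above can be checked with a few lines of NumPy. This is a minimal sketch, not a reference implementation: the helper name `r_squared` and the toy targets are made up for illustration, and it assumes the targets are not all identical (otherwise the baseline error is zero and the ratio is undefined).

```python
import numpy as np

def r_squared(y_true, y_pred):
    """One minus (model squared error / mean-baseline squared error)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # squared error of the model
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # squared error of the mean baseline
    return 1.0 - ss_res / ss_tot

# Made-up targets for illustration; their mean is 3.
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

perfect = r_squared(y, y)                            # exact predictions
baseline = r_squared(y, np.full_like(y, y.mean()))   # always predict the mean
worse = r_squared(y, y[::-1])                        # predictions in reversed order

print(perfect, baseline, worse)  # 1.0 0.0 -3.0
```

The reversed predictions have four times the squared error of the mean baseline (40 versus 10), which is why the score lands at minus three rather than somewhere between zero and one.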

Reading it carefully

  • R squared measures explained variance, not whether predictions are unbiased.
  • On training data, adding features to a least squares fit never lowers R squared, even when the features are useless, so adjusted R squared penalizes extra features.
  • A high R squared on training data can collapse on new data, so always check it on a held-out set.
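
The last two points can be explored with a small simulation. This is a sketch under stated assumptions: the data is synthetic, the helper names (`r_squared`, `adjusted_r_squared`, `fit_predict`) are hypothetical, the model is plain least squares with an intercept, and the adjustment uses the common form 1 - (1 - R²)(n - 1)/(n - p - 1) for n samples and p features.

```python
import numpy as np

def r_squared(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n_samples, n_features):
    # Common adjustment: shrink R squared as the feature count grows.
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)

def fit_predict(X_train, y_train, X_test):
    # Ordinary least squares with an explicit intercept column.
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return A_train @ coef, A_test @ coef

# Synthetic data: one informative feature plus 20 pure-noise features.
rng = np.random.default_rng(0)
n_train, n_test = 40, 200
x = rng.normal(size=(n_train + n_test, 1))
y = 2.0 * x[:, 0] + rng.normal(size=n_train + n_test)
noise = rng.normal(size=(n_train + n_test, 20))

results = {}
for name, X in [("1 real feature", x),
                ("plus 20 noise features", np.hstack([x, noise]))]:
    X_tr, X_te = X[:n_train], X[n_train:]
    y_tr, y_te = y[:n_train], y[n_train:]
    pred_tr, pred_te = fit_predict(X_tr, y_tr, X_te)
    r2_train = r_squared(y_tr, pred_tr)
    results[name] = {
        "train": r2_train,
        "adjusted": adjusted_r_squared(r2_train, n_train, X.shape[1]),
        "held_out": r_squared(y_te, pred_te),  # baseline is the held-out mean
    }
    print(name, results[name])
```

Comparing the two printed rows shows how each score reacts to the extra columns: training R squared cannot go down when features are added to a least squares fit, while the adjusted and held-out scores are free to move the other way.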

Key idea

R squared compares your model against simply predicting the mean. One is perfect, zero matches the baseline, and a negative value means your model is worse than guessing the average.

Check yourself

1. What does a negative R squared mean?

2. Why use adjusted R squared?