R Squared For Regression

A relative score for regression

Raw error depends on the scale of the target, so it is hard to judge on its own. R squared, the coefficient of determination, gives a scale-free score by comparing your model to a trivial baseline that always predicts the mean of the target.

The comparison

R squared equals one minus the ratio of your model's squared error to the squared error of the mean baseline.
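
The same statement can be written as a formula. The symbols are notation assumed here for illustration, not taken from the text: y_i is the i-th target, \hat{y}_i is the model's prediction for it, and \bar{y} is the mean of the targets.

```latex
R^2 = 1 - \frac{\sum_{i} (y_i - \hat{y}_i)^2}{\sum_{i} (y_i - \bar{y})^2}
```

The numerator is the model's squared error and the denominator is the squared error of the mean baseline, so the ratio is undefined when every target has the same value.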

An R squared of one means the model's predictions match every target exactly, so it explains all of the variance.
An R squared of zero means it does no better than always guessing the mean.
A negative R squared means the model is worse than the mean baseline.
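
The three cases above can be reproduced with a small sketch. The helper name r_squared and the toy targets are made up for illustration, and the function assumes the targets are not all identical (otherwise the baseline error is zero).

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Return 1 - (model squared error) / (mean-baseline squared error)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # model squared error
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # mean-baseline squared error
    return 1.0 - ss_res / ss_tot

# Toy targets, chosen only for illustration
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

perfect = r_squared(y, y)                            # predictions equal the targets
baseline = r_squared(y, np.full_like(y, y.mean()))   # always predict the mean
worse = r_squared(y, y[::-1])                        # predictions in reversed order

print(perfect, baseline, worse)  # 1.0 0.0 -3.0
```

The reversed predictions have four times the squared error of the mean baseline (40 against 10), which is why the score lands at one minus four.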

Reading it carefully

R squared measures the share of variance explained; a high value does not by itself show that predictions are unbiased.
Adding more features can inflate R squared on the training data even when they are useless (for a least squares fit it never goes down), which is why adjusted R squared penalizes extra features.
A high R squared on training data can collapse on new data, so always check it on a held-out set.
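
A minimal sketch of the last two cautions, under stated assumptions: the data is synthetic, the model is ordinary least squares fitted with NumPy, and every name here (fit_predict, adjusted_r_squared, the junk features) is hypothetical. The adjustment used is the common form 1 - (1 - R squared)(n - 1)/(n - p - 1), with n samples and p features.

```python
import numpy as np

def r_squared(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n_samples, n_features):
    # Common adjustment: shrinks R squared as the feature count grows
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)

def fit_predict(X_train, y_train, X_test):
    # Ordinary least squares with an intercept column
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return A_train @ coef, A_test @ coef

# Synthetic data (an assumption): one informative feature plus pure-noise features
rng = np.random.default_rng(0)
n_train, n_test = 40, 200
n_total = n_train + n_test
x = rng.normal(size=(n_total, 1))
y = 2.0 * x[:, 0] + rng.normal(size=n_total)
junk = rng.normal(size=(n_total, 25))  # unrelated to y by construction

feature_sets = {
    "real feature only": x,
    "real + 25 junk features": np.hstack([x, junk]),
}

results = {}
for name, X in feature_sets.items():
    pred_train, pred_test = fit_predict(X[:n_train], y[:n_train], X[n_train:])
    r2_train = r_squared(y[:n_train], pred_train)
    r2_adj = adjusted_r_squared(r2_train, n_train, X.shape[1])
    r2_test = r_squared(y[n_train:], pred_test)  # scored on rows the fit never saw
    results[name] = (r2_train, r2_adj, r2_test)
    print(f"{name}: train={r2_train:.3f} adjusted={r2_adj:.3f} held-out={r2_test:.3f}")
```

The training score with the junk features cannot be lower than the score without them, because the smaller model is a special case of the larger one. The adjusted and held-out scores are the ones expected to expose the useless features, though the exact numbers depend on the random draw.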

Key idea

R squared compares your model against simply predicting the mean. One is perfect, zero matches the baseline, and a negative value means your model is worse than guessing the average.

