Random Forest Tuning

The handful of knobs that actually move a random forest, from tree count to feature sampling.

What a forest controls

A random forest averages many decorrelated trees, each grown on a bootstrap sample with random feature subsets at each split. A few hyperparameters shape that ensemble.

The knobs that matter

Number of trees: more trees lower variance and do not hurt accuracy in expectation; the only cost is compute. Use enough that the score plateaus.
Max features: the number of features tried per split. Fewer features mean more decorrelated, more diverse trees, though each tree is individually weaker.
Max depth and min samples per leaf: limit individual tree complexity to control overfitting.
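
One standard way to see why the first two knobs matter is the variance of an average of correlated predictors. The symbols are introduced here for illustration and do not come from the text above: B is the number of trees, T_b(x) is the prediction of tree b, each tree's prediction has variance \sigma^2, and any two trees have correlation \rho.

```latex
\operatorname{Var}\!\left(\frac{1}{B}\sum_{b=1}^{B} T_b(x)\right)
  = \rho\,\sigma^2 + \frac{1-\rho}{B}\,\sigma^2
```

Raising B shrinks only the second term, which is why the score plateaus rather than improving forever. Lowering max features tends to reduce \rho, and with it the floor \rho\sigma^2 that no number of trees can remove.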

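Using out-of-bag error

Each tree's bootstrap sample leaves out about a third of the data (roughly 37%); these are its out-of-bag samples. Predicting each sample using only the trees that did not train on it gives a free validation estimate without a separate holdout, handy for quick tuning.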
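
The "about a third" figure can be checked with a small simulation. This is a minimal sketch using only the standard library; the function names are made up for illustration, and it simulates one bootstrap draw rather than training a forest.

```python
import math
import random


def simulated_oob_fraction(n_samples, seed=0):
    """Fraction of rows left out of one bootstrap sample of size n_samples."""
    rng = random.Random(seed)
    # A bootstrap sample draws n_samples row indices with replacement.
    drawn = {rng.randrange(n_samples) for _ in range(n_samples)}
    # Rows that were never drawn are this tree's out-of-bag samples.
    return 1 - len(drawn) / n_samples


def expected_oob_fraction(n_samples):
    """Probability that a given row is missed by all n_samples draws."""
    return (1 - 1 / n_samples) ** n_samples


if __name__ == "__main__":
    n = 10_000
    print("simulated:", round(simulated_oob_fraction(n), 3))
    print("expected: ", round(expected_oob_fraction(n), 3))
    print("limit 1/e:", round(math.exp(-1), 3))
```

For large datasets the expected fraction approaches 1/e, so a little over a third of the rows are available to validate each tree.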

Practical guidance

Start by raising tree count until the out-of-bag score stops improving.
Then tune max features, the single most impactful diversity knob.
Random forests are forgiving, so depth limits rarely need fine-tuning unless overfitting is severe.
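
The first step can be sketched as a simple stopping rule. This is an illustrative sketch, not a prescribed procedure: `first_plateau`, `score_for`, the tolerance `tol`, and the toy score curve are all assumptions introduced here. `score_for` stands in for whatever fits a forest of a given size and returns its out-of-bag score.

```python
def first_plateau(tree_counts, score_for, tol=0.002):
    """Return the last tree count before the score gain drops below tol.

    tree_counts: increasing candidate forest sizes.
    score_for:   hypothetical callable mapping a tree count to an
                 out-of-bag score, where higher is better.
    tol:         assumed minimum gain that still counts as improvement.
    """
    previous_n, previous_score = None, None
    for n in tree_counts:
        score = score_for(n)
        if previous_score is not None and score - previous_score < tol:
            # Growing from previous_n to n trees no longer paid off.
            return previous_n
        previous_n, previous_score = n, score
    # No plateau within the candidates tried: report the largest size.
    return previous_n


def toy_oob_score(n_trees):
    """Made-up saturating curve used only to exercise the helper."""
    return 0.90 - 0.5 / n_trees


if __name__ == "__main__":
    print(first_plateau([25, 50, 100, 200, 400, 800], toy_oob_score))
```

With scikit-learn, `score_for` could fit a `RandomForestClassifier(n_estimators=n, oob_score=True)` and return its `oob_score_` attribute. Real out-of-bag scores are noisy, so the tolerance should be set above that noise.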

Key idea

The high-impact random forest knobs are tree count, which only helps until it plateaus, and max features, which controls tree diversity. Out-of-bag error gives a free validation score for tuning.

