Random Forests

The idea

A single deep decision tree has low bias but high variance. A random forest trains many trees and averages them, which keeps the low bias while sharply reducing variance.
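The variance claim can be checked with a small simulation. The sketch below is a toy model, not a real forest: each "tree" is assumed to produce the true value plus a unit-variance error, and the hypothetical parameter rho sets how strongly the errors of different trees move together. The function name and all parameters are illustrative assumptions.

```python
import random
import statistics


def averaged_error_variance(n_trees, rho, n_trials=5000, seed=0):
    """Empirical variance of the mean of n_trees unit-variance errors
    whose pairwise correlation is rho (toy model, assumed Gaussian)."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_trials):
        shared = rng.gauss(0.0, 1.0)  # error component common to every tree
        errors = [
            (rho ** 0.5) * shared + ((1.0 - rho) ** 0.5) * rng.gauss(0.0, 1.0)
            for _ in range(n_trees)
        ]
        means.append(sum(errors) / n_trees)
    return statistics.pvariance(means)


single = averaged_error_variance(n_trees=1, rho=0.0)
independent = averaged_error_variance(n_trees=50, rho=0.0)
correlated = averaged_error_variance(n_trees=50, rho=0.5)

# Theory for this toy model: rho + (1 - rho) / n_trees,
# i.e. about 1.0, 0.02, and 0.51 for the three cases.
print(single, independent, correlated)
```

In this model, averaging 50 independent errors cuts the variance by a factor of about 50, while correlated errors leave a floor of rho that no number of trees removes.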

Two sources of randomness

For averaging to help, the trees' errors must be at least partly uncorrelated: identical trees would make identical mistakes, and averaging them would change nothing. Random forests inject randomness twice:

Bagging trains each tree on a bootstrap sample, a random draw of rows with replacement.
At each split, only a random subset of features is considered, which decorrelates the trees.
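The two sources of randomness can be sketched with the standard library alone. The helper names below are hypothetical, and the subset size of 3 out of 9 features follows the common square-root heuristic for classification, which is an assumption rather than something the text specifies.

```python
import random


def bootstrap_indices(n_rows, rng):
    """Draw n_rows row indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]


def random_feature_subset(n_features, k, rng):
    """Pick k distinct feature indices to consider at a single split."""
    return sorted(rng.sample(range(n_features), k))


rng = random.Random(42)

# One tree's training rows: duplicates are expected, and some rows are missing.
rows = bootstrap_indices(n_rows=10, rng=rng)

# One split's candidate features: drawn afresh at every split, without repeats.
features = random_feature_subset(n_features=9, k=3, rng=rng)

print(rows)
print(features)
```

The row sample is drawn once per tree, while the feature subset is redrawn at every split, so even two trees that share many rows tend to split on different features.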

Combining predictions

For classification the forest takes a majority vote across trees. For regression it averages their outputs. Because the trees make different mistakes, the errors partly cancel.
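As a minimal sketch of the two combination rules, assume each tree has already produced its prediction for one input. The labels and values below are made up for illustration, and the helper names are hypothetical.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most common label among the trees' predictions.
    Ties go to the label that appears first in the input."""
    return Counter(labels).most_common(1)[0][0]


def average(values):
    """Return the arithmetic mean of the trees' numeric predictions."""
    return sum(values) / len(values)


tree_labels = ["spam", "ham", "spam", "spam", "ham"]  # classification
tree_values = [2.0, 3.0, 4.0]                         # regression

forest_label = majority_vote(tree_labels)
forest_value = average(tree_values)

print(forest_label, forest_value)  # spam 3.0
```

Some implementations average predicted class probabilities instead of counting hard votes; the sketch above shows only the plain majority vote described in the text.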

Handy extras

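The rows left out of a tree's bootstrap sample form its out-of-bag (OOB) set. Scoring each row using only the trees that never saw it gives a validation estimate without setting aside a separate holdout set. Forests also produce feature importance scores that can be used to rank the features.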
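To get a feel for how large the out-of-bag set is, the sketch below draws one bootstrap sample and counts the rows that were never picked. It assumes the bootstrap has the same size as the training set. The helper name is hypothetical.

```python
import random


def out_of_bag_rows(n_rows, rng):
    """Return the row indices that a bootstrap sample of size n_rows missed."""
    in_bag = {rng.randrange(n_rows) for _ in range(n_rows)}
    return [i for i in range(n_rows) if i not in in_bag]


rng = random.Random(0)
n_rows = 10_000

oob = out_of_bag_rows(n_rows, rng)
oob_fraction = len(oob) / n_rows

# Each row is missed by a single draw with probability 1 - 1/n, so it is
# missed by all n draws with probability (1 - 1/n)**n, which tends to 1/e.
expected = (1 - 1 / n_rows) ** n_rows

print(oob_fraction, expected)
```

Under this assumption roughly 37% of the rows are out of bag for any given tree, so every row has a sizeable group of trees that can score it as unseen data.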

Key idea

Random forests average many decorrelated trees built with bagging and random feature subsets, sharply reducing variance while leaving bias roughly unchanged.
