Machine Learning

Decision Trees

Models that split data into regions using simple yes-or-no questions.

What it is

A decision tree predicts by asking a sequence of yes-or-no questions about the features. Each internal node tests one feature against a threshold, and each leaf holds a prediction: a class label for classification or a numeric value for regression. The path from root to leaf is a chain of simple rules.
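To make the "chain of simple rules" concrete, here is a minimal sketch of a hand-built tree and the walk from root to leaf. The feature names (`temperature`, `wind`), the thresholds, and the labels are invented for illustration, and the `Node`, `Leaf`, and `predict` names are hypothetical rather than taken from any library.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    prediction: str  # the value returned when a sample lands here


@dataclass
class Node:
    feature: str               # which feature this node tests
    threshold: float           # the question is "is feature <= threshold?"
    yes: Union["Node", Leaf]   # branch taken when the answer is yes
    no: Union["Node", Leaf]    # branch taken when the answer is no


def predict(tree, sample):
    """Walk from the root to a leaf, answering one question per node."""
    node = tree
    while isinstance(node, Node):
        node = node.yes if sample[node.feature] <= node.threshold else node.no
    return node.prediction


# Illustrative tree only: names and thresholds are assumptions, not learned.
tree = Node(
    "temperature", 20.0,
    yes=Leaf("stay in"),
    no=Node("wind", 30.0, yes=Leaf("go out"), no=Leaf("stay in")),
)

print(predict(tree, {"temperature": 25.0, "wind": 10.0}))  # go out
```

Reading the path for that sample gives the rule "temperature > 20 and wind <= 30, so go out", which is why trees are easy to explain.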

How it learns

The tree grows greedily: each node picks the locally best split and never revisits earlier choices. At each node it searches for the split that best separates the data, measured by a purity criterion:

  • Gini impurity or entropy for classification
  • Variance reduction for regression
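The criteria listed above can be written in a few lines each. The sketch below uses the standard definitions (Gini as one minus the sum of squared class proportions, entropy in bits, variance as mean squared deviation); the function names are illustrative, and each function assumes a non-empty input.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions (0 means pure)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: -sum(p * log2(p)) over classes (0 means pure)."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def variance(values):
    """Mean squared deviation from the mean, the regression counterpart."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


print(gini(["a", "a", "b", "b"]))     # 0.5
print(entropy(["a", "a", "b", "b"]))  # 1.0
print(gini(["a", "a", "a", "a"]))     # 0.0
print(variance([1.0, 3.0]))           # 1.0
```

A split is scored by how much it lowers the chosen measure, comparing the parent node against the size-weighted average of its children.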

It keeps splitting until a stopping rule, like a maximum depth or a minimum number of samples per leaf, kicks in.
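The greedy growth and the stopping rules can be sketched as follows. This is a simplified illustration, not how any particular library implements it: it assumes numeric features, uses Gini impurity, tries every observed value as a threshold, and also stops when no split lowers impurity. The names `best_split` and `build` and the dictionary layout of a node are hypothetical, and `min_samples_leaf` is assumed to be at least 1.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, min_samples_leaf):
    """Try every (feature, threshold) pair; keep the lowest weighted Gini."""
    best = None  # (score, feature_index, threshold)
    parent = gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            # Stopping rule: both children must keep enough samples.
            if len(left) < min_samples_leaf or len(right) < min_samples_leaf:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < parent and (best is None or score < best[0]):
                best = (score, f, t)
    return best


def build(rows, labels, depth=0, max_depth=3, min_samples_leaf=1):
    """Grow greedily; return a majority-label leaf when a stopping rule fires."""
    split = None
    if depth < max_depth and len(set(labels)) > 1:
        split = best_split(rows, labels, min_samples_leaf)
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    _, f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "yes": build([r for r, _ in left], [y for _, y in left],
                     depth + 1, max_depth, min_samples_leaf),
        "no": build([r for r, _ in right], [y for _, y in right],
                    depth + 1, max_depth, min_samples_leaf),
    }


rows = [[1.0], [2.0], [3.0], [4.0]]
labels = ["a", "a", "b", "b"]
print(build(rows, labels))
# {'feature': 0, 'threshold': 2.0, 'yes': 'a', 'no': 'b'}
```

Lowering `max_depth` or raising `min_samples_leaf` makes the recursion stop earlier, which yields a smaller tree with coarser regions.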

Strengths and weaknesses

  • Easy to interpret and visualize
  • Handles mixed feature types and needs little preprocessing
  • Prone to overfitting if grown too deep, since a deep tree can memorize noise

Controlling complexity

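Pruning and depth limits curb overfitting. A single tree also tends to have high variance, since small changes in the training data can produce a very different tree. That is why a single tree is rarely best on its own, and it motivates ensembles like random forests, which combine many trees.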

Key idea

A decision tree splits data with greedy, purity-based questions, is highly interpretable, and needs depth limits or pruning to avoid memorizing noise.

Check yourself

1. What does each internal node of a decision tree do?

2. What is a common failure mode of deep trees?