← Lessons

quiz vs the machine

Silver1030

Machine Learning

Descriptive Statistics Mean Median Mode

The three ways to summarize the center of a dataset.

4 min read · intro · beat Silver to climb

Descriptive Statistics Mean Median Mode

Before modeling anything, you need to summarize your data. The three classic measures of central tendency describe where the center of a distribution sits.

The three measures

  • The mean is the arithmetic average: add every value and divide by the count.
  • The median is the middle value when the data is sorted, splitting it into two equal halves.
  • The mode is the value that appears most often.

When each one helps

The mean uses every value, so it is precise but sensitive to outliers. A single huge salary drags the mean income upward.

The median ignores magnitudes and only cares about rank, so it is robust to outliers. This is why incomes and house prices are usually reported as medians.

The mode is the only summary that works for categorical data, like the most common eye color, where averaging makes no sense.

A worked feel

For the values 2, 3, 3, 8, the mean is 4, the median is 3, and the mode is 3. The lone 8 pulls the mean above the rest of the cluster.

Key idea

Mean, median, and mode each describe a dataset center, and the median resists outliers that distort the mean.

Check yourself

Answer to earn rating on the learn ladder.

1. Which measure is most robust to extreme outliers?

2. For the values 2, 3, 3, 8 the mean equals what?

3. Which measure works for purely categorical data like favorite color?