Statistics and Histograms

The data summaries that power good row count estimates.

Good estimates need good summaries. Databases collect statistics about each column and store histograms that describe how values are distributed.

Basic statistics

For each column the engine often tracks the row count, the number of distinct values, the fraction of nulls, and the minimum and maximum. These let it estimate simple filters quickly.
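
As a rough illustration of how these numbers turn into row estimates, here is a minimal Python sketch. The names `ColumnStats`, `collect_stats`, `estimate_equality_rows`, and `estimate_less_than_rows` are hypothetical, and the formulas assume values are spread uniformly, which real engines refine in various ways.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class ColumnStats:
    """Hypothetical container for the basic per-column statistics."""
    row_count: int
    distinct_count: int
    null_fraction: float
    min_value: Optional[float]
    max_value: Optional[float]


def collect_stats(values: Sequence[Optional[float]]) -> ColumnStats:
    """Scan a column once and summarize it (None stands for NULL)."""
    non_null = [v for v in values if v is not None]
    row_count = len(values)
    return ColumnStats(
        row_count=row_count,
        distinct_count=len(set(non_null)),
        null_fraction=(row_count - len(non_null)) / row_count if row_count else 0.0,
        min_value=min(non_null) if non_null else None,
        max_value=max(non_null) if non_null else None,
    )


def estimate_equality_rows(stats: ColumnStats) -> float:
    """Estimate rows matching `col = x`, assuming every distinct value is equally common."""
    if stats.distinct_count == 0:
        return 0.0
    return stats.row_count * (1 - stats.null_fraction) / stats.distinct_count


def estimate_less_than_rows(stats: ColumnStats, bound: float) -> float:
    """Estimate rows matching `col < bound`, assuming values are spread evenly from min to max."""
    if stats.min_value is None or stats.max_value is None:
        return 0.0
    non_null_rows = stats.row_count * (1 - stats.null_fraction)
    if bound <= stats.min_value:
        return 0.0
    if bound > stats.max_value:
        return non_null_rows
    fraction = (bound - stats.min_value) / (stats.max_value - stats.min_value)
    return non_null_rows * fraction


# Made-up sample column: 10 rows, 2 NULLs, and the value 10 is heavily repeated.
sample = [1, 2, 2, 3, None, 10, 10, 10, 10, None]
stats = collect_stats(sample)
equality_estimate = estimate_equality_rows(stats)
less_than_estimate = estimate_less_than_rows(stats, 5.5)
```

In this sample the equality estimate is the same for every value, even though 10 appears four times and 1 appears once. Basic statistics alone cannot see that kind of skew.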

Histograms

A histogram divides a column's value range into buckets and records how many rows fall in each. This captures skew, where some values are far more common than others. An equi-depth histogram chooses its bucket boundaries so that each bucket holds roughly the same number of rows, which means dense regions get narrower buckets and therefore finer detail.
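
To make the difference concrete, the following Python sketch builds both kinds of bucket over the same skewed data. The function names `equi_depth_histogram` and `equi_width_counts` are illustrative, and the bucket layout (low, high, row count) is an assumption rather than any particular engine's format.

```python
from typing import List, Sequence, Tuple

Bucket = Tuple[float, float, int]  # (lowest value, highest value, row count)


def equi_depth_histogram(values: Sequence[float], num_buckets: int) -> List[Bucket]:
    """Sort the values and cut them into runs of roughly equal size."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0 or num_buckets <= 0:
        return []
    buckets: List[Bucket] = []
    for i in range(num_buckets):
        start = i * n // num_buckets
        end = (i + 1) * n // num_buckets
        if start == end:
            continue  # more buckets requested than rows available
        chunk = ordered[start:end]
        buckets.append((chunk[0], chunk[-1], len(chunk)))
    return buckets


def equi_width_counts(values: Sequence[float], num_buckets: int) -> List[int]:
    """For contrast: split min..max into equal-width ranges and count rows in each."""
    if not values or num_buckets <= 0:
        return []
    lo, hi = min(values), max(values)
    counts = [0] * num_buckets
    width = (hi - lo) / num_buckets
    for v in values:
        # All-equal columns have zero width, so everything lands in bucket 0.
        idx = 0 if width == 0 else min(int((v - lo) / width), num_buckets - 1)
        counts[idx] += 1
    return counts


# Made-up skewed column: most values are small, with two large outliers.
column = [1, 2, 3, 4, 5, 6, 7, 8, 50, 100]
depth_buckets = equi_depth_histogram(column, 5)
width_counts = equi_width_counts(column, 5)
```

With this data the equal-width version puts eight of the ten rows in its first bucket and leaves two buckets empty, while the equi-depth version spends four narrow buckets on the dense region from 1 to 8 and one wide bucket on the sparse tail.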

  • Distinct counts estimate equality selectivity: if values are evenly spread, about 1 / (number of distinct values) of the non-null rows match.
  • Histograms estimate range selectivity and handle skew.
  • Statistics go stale as the data changes, and stale statistics cause bad estimates, so engines refresh them.
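
The range-selectivity point can be sketched in a few lines of Python. This is one simple approach, assuming values are continuous and evenly spread inside each bucket. The function `estimate_rows_between` and the bucket numbers are made up for illustration.

```python
from typing import List, Tuple

Bucket = Tuple[float, float, int]  # (low bound, high bound, row count)


def estimate_rows_between(buckets: List[Bucket], lo: float, hi: float) -> float:
    """Estimate rows with lo <= value <= hi by summing each bucket's overlapping share."""
    total = 0.0
    for b_lo, b_hi, count in buckets:
        overlap_lo = max(lo, b_lo)
        overlap_hi = min(hi, b_hi)
        if overlap_lo > overlap_hi:
            continue  # the query range misses this bucket entirely
        if b_hi == b_lo:
            total += count  # single-value bucket: all of its rows qualify
        else:
            # Linear interpolation: assume rows are evenly spread inside the bucket.
            total += count * (overlap_hi - overlap_lo) / (b_hi - b_lo)
    return total


# Hypothetical equi-depth histogram: 100 rows per bucket, but the last bucket is very wide.
histogram = [(0, 10, 100), (10, 20, 100), (20, 100, 100)]
total_rows = sum(count for _, _, count in histogram)

dense_estimate = estimate_rows_between(histogram, 5, 15)     # inside the dense region
sparse_estimate = estimate_rows_between(histogram, 60, 100)  # inside the sparse tail

# What a min/max-only estimate would say for the same ranges (uniform over 0..100).
uniform_dense = total_rows * (15 - 5) / (100 - 0)
uniform_sparse = total_rows * (100 - 60) / (100 - 0)
```

Here the histogram estimates 100 rows for the range 5 to 15, where a uniform min/max estimate gives 30. For the range 60 to 100 it estimates 50 rows, where the uniform estimate gives 120. The gap between the two methods is the skew the histogram captures.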

Key idea

Statistics and histograms summarize column distributions so the optimizer can estimate selectivity, and keeping them fresh is essential for good plans.

Check yourself

1. What does a histogram capture about a column?

2. What does an equi depth histogram aim for?

3. Why must statistics be refreshed?