Cost-Based Query Planning

Choosing among equivalent execution plans using statistics to estimate which is cheapest to run.

Many plans, one query

A single declarative query can run in many ways. The optimizer can reorder joins, pick a broadcast or shuffle join, and choose scan methods. These alternatives are logically equivalent, meaning they return the same result, but their costs can differ by orders of magnitude. A cost-based optimizer estimates each plan's cost and picks the cheapest.

What feeds the estimate

Statistics such as row counts, distinct-value counts, and column histograms describe the data.
Cardinality estimation predicts how many rows each operator emits, which drives downstream cost.
A cost model converts estimated rows into expected CPU, memory, and network work.
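The three inputs above can be sketched in a few lines. This is a minimal illustration, not any particular engine's method: it assumes uniformly distributed values, uses the textbook rule of dividing by the larger distinct-value count for equi-joins, and the names (TableStats, estimate_equality_filter, estimate_equi_join, plan_cost), the example tables, and the weights are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    """Hypothetical per-table statistics kept by the optimizer."""
    row_count: int
    distinct_values: dict  # column name -> number of distinct values (NDV)


def estimate_equality_filter(stats: TableStats, column: str) -> float:
    """Rows surviving `column = constant`, assuming values are spread uniformly."""
    return stats.row_count / max(stats.distinct_values[column], 1)


def estimate_equi_join(left_rows: float, left_ndv: int,
                       right_rows: float, right_ndv: int) -> float:
    """Rows emitted by an equi-join: |L| * |R| / max(NDV_L, NDV_R)."""
    return left_rows * right_rows / max(left_ndv, right_ndv, 1)


def plan_cost(rows_processed: float, rows_buffered: float, rows_shipped: float,
              cpu_weight: float = 1.0, memory_weight: float = 0.5,
              network_weight: float = 4.0) -> float:
    """Toy cost model: a weighted sum of CPU, memory, and network work.
    The weights are made-up illustration values."""
    return (cpu_weight * rows_processed
            + memory_weight * rows_buffered
            + network_weight * rows_shipped)


# Assumed example statistics, not taken from any real system.
orders = TableStats(1_000_000, {"status": 5, "customer_id": 50_000})
customers = TableStats(50_000, {"customer_id": 50_000})

# Filter orders on status, then join the survivors to customers.
filtered_orders = estimate_equality_filter(orders, "status")
joined_rows = estimate_equi_join(
    filtered_orders, orders.distinct_values["customer_id"],
    customers.row_count, customers.distinct_values["customer_id"],
)

# Assume customers is the side that is buffered and sent over the network.
cost = plan_cost(
    rows_processed=filtered_orders + customers.row_count,
    rows_buffered=customers.row_count,
    rows_shipped=customers.row_count,
)
print(filtered_orders, joined_rows, cost)
```

Every downstream number depends on the first estimate: if the filter estimate is off by a factor of ten, so are the join estimate and the cost.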

Why join order dominates

Join order is the biggest lever. Joining two large tables early can explode intermediate rows, while filtering first keeps them small. The optimizer searches the space of orders guided by cardinality estimates.
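A small search over join orders shows the effect. The sketch below is illustrative only: it enumerates every left-deep order for three hypothetical tables, scores each order by the sum of its intermediate result sizes, and assumes regions has already been filtered down to one row. The table sizes, the JOIN_NDV map, and the function names are assumptions, and real optimizers prune the search rather than trying every permutation.

```python
from itertools import permutations

# Assumed row counts after filters; regions is already filtered to one row.
ROWS = {"orders": 1_000_000, "customers": 50_000, "regions": 1}

# Assumed join predicates: table pair -> distinct values of the join key.
# A pair with no entry has no predicate, so joining it is a cross product.
JOIN_NDV = {
    frozenset({"orders", "customers"}): 50_000,
    frozenset({"customers", "regions"}): 10,
}


def left_deep_cost(order):
    """Sum of estimated intermediate result sizes for one left-deep join order."""
    joined = [order[0]]
    rows = float(ROWS[order[0]])
    cost = 0.0
    for table in order[1:]:
        rows *= ROWS[table]
        # Apply every predicate linking the new table to tables already joined.
        for previous in joined:
            ndv = JOIN_NDV.get(frozenset({previous, table}))
            if ndv is not None:
                rows /= ndv
        joined.append(table)
        cost += rows
    return cost


def best_join_order(tables):
    """Exhaustively try every left-deep order and keep the cheapest."""
    return min(permutations(tables), key=left_deep_cost)


best = best_join_order(["orders", "customers", "regions"])
for order in permutations(ROWS):
    print(order, left_deep_cost(order))
print("cheapest:", best, left_deep_cost(best))
```

Under these assumed numbers, the orders that join customers to the filtered regions table before touching orders keep the intermediate results about ten times smaller than the orders that start with the large orders table.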

The fragility

Cost-based planning is only as good as its statistics. Stale or missing statistics produce bad cardinality estimates, and bad estimates can lead to disastrous plans. This is why engines refresh statistics and sometimes adapt plans at runtime.
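A stale row count can flip the choice between a broadcast join and a shuffle join. The following sketch uses deliberately simple, assumed network-cost formulas (a broadcast sends the build side to every worker, a shuffle moves each row of both inputs once). The function names, row counts, and worker count are hypothetical.

```python
def broadcast_cost(build_rows: int, workers: int) -> int:
    """Assumed cost: the build side is copied to every worker."""
    return build_rows * workers


def shuffle_cost(left_rows: int, right_rows: int) -> int:
    """Assumed cost: every row of both inputs crosses the network once."""
    return left_rows + right_rows


def choose_join(fact_rows: int, dim_rows: int, workers: int):
    """Pick the cheaper strategy according to the row counts it is given."""
    b = broadcast_cost(dim_rows, workers)
    s = shuffle_cost(fact_rows, dim_rows)
    return ("broadcast", b) if b < s else ("shuffle", s)


WORKERS = 10
FACT_ROWS = 10_000_000
STALE_DIM_ROWS = 1_000        # what the outdated statistics claim
ACTUAL_DIM_ROWS = 5_000_000   # what the table really holds now

# The optimizer plans with the stale number.
planned_strategy, estimated_cost = choose_join(FACT_ROWS, STALE_DIM_ROWS, WORKERS)

# What that plan costs when it runs against the real table.
real_cost_of_planned = broadcast_cost(ACTUAL_DIM_ROWS, WORKERS)

# What the optimizer would have chosen with fresh statistics.
fresh_strategy, fresh_cost = choose_join(FACT_ROWS, ACTUAL_DIM_ROWS, WORKERS)

print(planned_strategy, estimated_cost, real_cost_of_planned)
print(fresh_strategy, fresh_cost)
```

With the stale count the broadcast looks nearly free, yet under these assumptions it ends up moving more than three times as many rows as the shuffle that fresh statistics would have selected.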

Key idea

A cost-based optimizer enumerates equivalent plans and uses statistics and cardinality estimates to choose the cheapest, with join order as the dominant and most fragile decision.
