Cardinality Estimation
Cardinality estimation is the optimizer's guess of how many rows each operator will emit. These row-count estimates feed the cost model, so a bad estimate can lead the optimizer to a bad plan.
Why it drives cost
The cost of a join or sort depends mostly on how many rows it touches. If the optimizer thinks a filter returns ten rows but it really returns a million, it may pick a nested loop join that becomes catastrophically slow.
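To make the ten-versus-a-million scenario concrete, here is a minimal sketch of a toy cost model. The cost formulas, the per-row lookup cost of 20, and the function names are all illustrative assumptions, not the cost functions of any real optimizer. The nested loop is assumed to do one index lookup on the inner table per outer row.

```python
# Toy cost model. The formulas and constants below are assumptions made
# for illustration only; real optimizers use far more detailed models.

def nested_loop_cost(outer_rows: int, lookup_cost: int = 20) -> int:
    # Assumed: one index lookup on the inner table per outer row.
    return outer_rows * lookup_cost


def hash_join_cost(outer_rows: int, inner_rows: int) -> int:
    # Assumed: build a hash table on the inner input, then probe it
    # once per outer row.
    return inner_rows + outer_rows


def pick_join(outer_rows: int, inner_rows: int) -> str:
    # Choose whichever join the toy model says is cheaper.
    nl = nested_loop_cost(outer_rows)
    hj = hash_join_cost(outer_rows, inner_rows)
    return "nested_loop" if nl <= hj else "hash_join"


INNER_ROWS = 50_000          # hypothetical inner table size
ESTIMATED_OUTER = 10         # what the optimizer believes the filter returns
ACTUAL_OUTER = 1_000_000     # what the filter really returns

chosen = pick_join(ESTIMATED_OUTER, INNER_ROWS)
best = pick_join(ACTUAL_OUTER, INNER_ROWS)

print("plan chosen from the estimate:", chosen)
print("plan that fits the real row count:", best)
print("nested loop cost at the real row count:", nested_loop_cost(ACTUAL_OUTER))
print("hash join cost at the real row count:", hash_join_cost(ACTUAL_OUTER, INNER_ROWS))
```

Under these assumed numbers the plan chosen from the estimate is the nested loop, while the real row count favors the hash join by a wide margin. The optimizer's decision was sound given its input; the input was wrong.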
How it estimates
The optimizer combines statistics with assumptions:
- A filter on a column uses its selectivity, the fraction of rows expected to pass.
- Multiple predicates are often assumed to be independent, so their selectivities are multiplied.
- Join sizes come from input sizes and key distributions.
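The three rules above can be sketched in a few lines. This is a simplified illustration, not any particular optimizer's implementation: the function names are hypothetical, the selectivities and table sizes are made up, and the join formula is the common textbook estimate for an equi-join, which assumes uniformly distributed keys.

```python
from math import prod


def estimate_filter_rows(table_rows: int, selectivities: list[float]) -> float:
    # Independence assumption: the combined selectivity is the product
    # of the per-predicate selectivities.
    return table_rows * prod(selectivities)


def estimate_join_rows(left_rows: int, right_rows: int,
                       left_ndv: int, right_ndv: int) -> float:
    # Textbook equi-join estimate, assuming uniform key distributions:
    # |L| * |R| / max(distinct keys in L, distinct keys in R).
    return left_rows * right_rows / max(left_ndv, right_ndv)


# Hypothetical table of 1,000,000 rows with two predicates:
# city = 'Paris' (selectivity 0.01) and country = 'France' (selectivity 0.05).
estimated = estimate_filter_rows(1_000_000, [0.01, 0.05])

# If every 'Paris' row is also a 'France' row, the second predicate filters
# nothing further, so the true count is just the 'Paris' rows.
actual_if_correlated = 1_000_000 * 0.01

print("estimate under independence:", round(estimated))
print("actual when predicates are fully correlated:", round(actual_if_correlated))

# Hypothetical join: 1,000,000 orders against 10,000 customers,
# with 10,000 distinct customer ids on both sides.
print("join estimate:", estimate_join_rows(1_000_000, 10_000, 10_000, 10_000))
```

In this made-up case the independence assumption underestimates the filter by a factor of about 20, because the two predicates are correlated rather than independent.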
Errors from these assumptions compound: each operator's estimate is built on the estimates below it, so estimates drift further from the true counts as plans grow deeper.
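A rough way to see the compounding is to assume, purely for illustration, that every operator misestimates by the same multiplicative factor and that the errors all point in the same direction. Real errors vary in size and can partly cancel, so this is a worst-case sketch with a hypothetical function name.

```python
def worst_case_error(per_operator_factor: float, depth: int) -> float:
    # Assumed: each operator is off by the same factor, and every error
    # pushes the estimate the same way, so the factors multiply.
    return per_operator_factor ** depth


for depth in (1, 3, 5):
    print(f"depth {depth}: estimate off by up to {worst_case_error(2.0, depth)}x")
```

Even a modest factor of 2 per operator grows to a factor of 32 after five operators under this assumption.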
Key idea
Cardinality estimation predicts row counts from statistics and assumptions, and because cost depends on these counts, estimation errors are a leading cause of slow plans.