Join Order Selection
When a query joins many tables, the order in which they are joined can change execution cost by orders of magnitude. Join order selection is the optimizer's task of choosing a sequence that keeps intermediate results small.
Why order matters
Each join produces an intermediate result that feeds the next join. If an early join explodes into millions of rows, every later join has to process them. A good order therefore applies the most selective joins first, so the data shrinks early.
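To make the effect concrete, here is a minimal sketch that estimates intermediate result sizes for two orders of the same three-table join. The table names, cardinalities, and selectivities are invented for illustration, and the size model (product of cardinalities times the selectivity of each applicable predicate) is a common textbook simplification, not the estimator of any particular optimizer.

```python
# Illustrative numbers only: these tables, cardinalities, and selectivities
# are assumptions made up for this example.
CARD = {"orders": 1_000_000, "customers": 10_000, "regions": 10}
SEL = {
    frozenset({"orders", "customers"}): 1 / 10_000,
    frozenset({"customers", "regions"}): 1 / 10,
}


def estimated_rows(tables):
    """Product of cardinalities times the selectivity of every join
    predicate whose two tables are both inside the set."""
    size = 1.0
    for name in tables:
        size *= CARD[name]
    for pair, selectivity in SEL.items():
        if pair <= tables:
            size *= selectivity
    return size


def intermediate_sizes(order):
    """Estimated size of each join result when joining left to right."""
    joined = frozenset([order[0]])
    sizes = []
    for name in order[1:]:
        joined = joined | {name}
        sizes.append(estimated_rows(joined))
    return sizes


if __name__ == "__main__":
    for order in [
        ("customers", "regions", "orders"),
        ("orders", "regions", "customers"),
    ]:
        sizes = intermediate_sizes(order)
        print(" -> ".join(order), [round(s) for s in sizes])
```

Under these assumed numbers, joining customers with regions first yields a first intermediate of about 10,000 rows, while starting with orders and regions (which share no join predicate here, so the result is a cross product) yields about 10,000,000 rows. Both orders end with the same final result size.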
The search problem
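The number of possible orders grows factorially with the number of tables: n tables can be arranged in n! left-deep sequences, so 10 tables already give 3,628,800 orders. Optimizers typically use dynamic programming, which finds the best plan for each subset of tables and reuses it when building plans for larger subsets. The number of subsets still grows exponentially, so for very large joins they switch to greedy or randomized search.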
- Early joins should keep intermediate results small.
- Cost depends on the estimated size of each intermediate result.
- The number of orders grows factorially, so exhaustive enumeration is impractical and a smarter search is required.
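The subset-based dynamic programming described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm of any specific optimizer: it considers only left-deep orders, allows cross products, and uses the sum of estimated intermediate result sizes as the cost. The tables A to D, their cardinalities, and the selectivities are hypothetical.

```python
from itertools import combinations

# Hypothetical statistics, invented for this example.
CARD = {"A": 1000, "B": 100, "C": 10, "D": 10000}
SEL = {
    frozenset("AB"): 0.01,
    frozenset("BC"): 0.1,
    frozenset("CD"): 0.001,
}


def rows(tables):
    """Estimated rows for joining a set of tables: product of cardinalities
    times the selectivity of every predicate contained in the set."""
    size = 1.0
    for name in tables:
        size *= CARD[name]
    for pair, selectivity in SEL.items():
        if pair <= tables:
            size *= selectivity
    return size


def best_left_deep_plan(names):
    """Return (cost, order) of the cheapest left-deep order, where cost is
    the sum of the estimated sizes of all join results."""
    # best[S] holds the cheapest (cost, order) found for the subset S.
    best = {frozenset([name]): (0.0, (name,)) for name in names}
    for k in range(2, len(names) + 1):
        for combo in combinations(names, k):
            subset = frozenset(combo)
            out = rows(subset)  # same output size whichever table comes last
            candidates = []
            for last in combo:
                # Reuse the already-solved plan for the subset without `last`.
                prev_cost, prev_order = best[subset - {last}]
                candidates.append((prev_cost + out, prev_order + (last,)))
            best[subset] = min(candidates)
    return best[frozenset(names)]


if __name__ == "__main__":
    cost, order = best_left_deep_plan(list(CARD))
    print(order, cost)
```

The table `best` has one entry per non-empty subset, so the sketch examines on the order of n * 2^n combinations instead of n! complete orders. That is a large saving, but it is still exponential, which is why very large joins call for greedy or randomized search instead.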
Key idea
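Join order selection picks a sequence that keeps intermediate results small. Because the space of orders is too large to enumerate, optimizers search it with dynamic programming and fall back to greedy or randomized search for very large joins.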