Subqueries That Depend On The Outer Row
A normal subquery runs once and is independent. A correlated subquery references a column from the outer query, so logically it runs once per outer row. This power comes with a cost.
How It Behaves
- The outer query produces a candidate row.
- The inner query runs using values from that row, such as matching the same customer id.
- The outer row is kept or dropped based on the inner result.
Common Uses
- Find rows that beat the per group average, like orders larger than that customer average.
- Test existence of a related row with EXISTS.
- Pull a single matching value, like the latest order date for each customer.
The Performance Risk
Because it conceptually reruns for every outer row, a naive correlated subquery can be slow on large tables, a pattern sometimes called the N plus one problem in SQL. Many planners can flatten it into a join or aggregate, but when they cannot, rewriting it as a join with a grouped subquery is often far faster.
Key idea
A correlated subquery reads the outer row and logically reruns per row, which is expressive but can be slow, so consider rewriting hot ones as joins.