Sort and Aggregation Operators
Many queries need rows in order or grouped into summaries. The sort and aggregation operators handle these, and each has memory and disk variants.
Sorting
A sort operator orders rows by one or more keys. If the rows fit in memory it sorts in place. If not, it does an external sort, writing sorted runs to disk and merging them. Sort enables ORDER BY, merge joins, and some grouping.
Aggregation
An aggregate computes summaries like count, sum, or average, often per group. Two strategies dominate:
- Hash aggregation keeps a hash table keyed by group, updating totals as rows arrive. It needs no sorted input.
- Sort based aggregation first sorts by the group key, then sums adjacent equal keys in one pass.
The optimizer picks based on whether input is already sorted and how many groups exist.
Key idea
Sort orders rows in memory or via external runs, while aggregation summarizes groups using either a hash table or sorted input, chosen by cost.