Partition and Cluster Keys

Two Levels of Organization

Partitioning divides a table into separate physical chunks by a key, often a date. A query filtered on that key reads only the matching partitions. Clustering sorts the data within each chunk by chosen columns so related rows sit near each other, which narrows the range of values each storage block covers.
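The two levels can be pictured with a small in-memory sketch. It is a toy model, not any particular warehouse's storage format: the function name `build_layout`, the `day` and `user_id` columns, and the `block_size` parameter are all assumptions made for illustration.

```python
from collections import defaultdict


def build_layout(rows, partition_key, cluster_key, block_size):
    """Group rows into partitions, sort each partition by the cluster key,
    then cut it into fixed-size blocks that record their min/max values."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_key]].append(row)

    layout = {}
    for part_value, part_rows in partitions.items():
        # Clustering: sort within the partition so nearby rows have nearby values.
        part_rows.sort(key=lambda r: r[cluster_key])
        blocks = []
        for i in range(0, len(part_rows), block_size):
            chunk = part_rows[i:i + block_size]
            blocks.append({
                "min": chunk[0][cluster_key],   # smallest value in this block
                "max": chunk[-1][cluster_key],  # largest value in this block
                "rows": chunk,
            })
        layout[part_value] = blocks
    return layout


# Hypothetical sample data: one partition per day, clustered by user_id.
rows = [
    {"day": "2024-01-01", "user_id": 7},
    {"day": "2024-01-01", "user_id": 2},
    {"day": "2024-01-01", "user_id": 9},
    {"day": "2024-01-01", "user_id": 4},
    {"day": "2024-01-02", "user_id": 5},
    {"day": "2024-01-02", "user_id": 1},
]
layout = build_layout(rows, "day", "user_id", block_size=2)

for day, blocks in layout.items():
    print(day, [(b["min"], b["max"]) for b in blocks])
# 2024-01-01 [(2, 4), (7, 9)]
# 2024-01-02 [(1, 5)]
```

Because each partition is sorted before it is cut into blocks, the blocks for 2024-01-01 cover the non-overlapping ranges 2 to 4 and 7 to 9 rather than two wide, overlapping ranges.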

How They Speed Queries

Partition pruning: a filter like a date range skips entire partitions.
Cluster locality: sorted columns let the engine skip blocks whose ranges miss the filter.
Together they reduce the bytes scanned, which is often the main driver of both query cost and latency.
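A rough sketch of both effects on one query follows. It assumes a simplified layout in which each partition is keyed by an ISO date string and each block stores only its minimum and maximum `user_id`; the `scan` function and the sample values are hypothetical, and real engines track far more metadata.

```python
def scan(layout, day_from, day_to, user_id):
    """Apply a day-range filter and a user_id equality filter.

    Returns (matches, blocks_read, blocks_total)."""
    blocks_total = sum(len(blocks) for blocks in layout.values())
    blocks_read = 0
    matches = []
    for day, blocks in layout.items():
        # Partition pruning: skip the whole partition if its key misses the range.
        # ISO date strings compare correctly as plain strings.
        if not (day_from <= day <= day_to):
            continue
        for block in blocks:
            # Cluster locality: skip a block whose min/max range cannot contain the value.
            if user_id < block["min"] or user_id > block["max"]:
                continue
            blocks_read += 1
            matches.extend(v for v in block["rows"] if v == user_id)
    return matches, blocks_read, blocks_total


# Hypothetical layout: rows are reduced to bare user_id values for brevity.
layout = {
    "2024-01-01": [{"min": 2, "max": 4, "rows": [2, 4]},
                   {"min": 7, "max": 9, "rows": [7, 9]}],
    "2024-01-02": [{"min": 1, "max": 5, "rows": [1, 5]},
                   {"min": 6, "max": 8, "rows": [6, 8]}],
    "2024-01-03": [{"min": 3, "max": 7, "rows": [3, 7]}],
}

matches, blocks_read, blocks_total = scan(layout, "2024-01-01", "2024-01-02", 7)
print(matches, blocks_read, blocks_total)  # [7] 2 5
```

In this toy case the 2024-01-03 partition is pruned outright, and one block in each remaining partition is skipped by its range, so only 2 of 5 blocks are read. Note that the second block read contains no match: range metadata rules blocks out, it does not guarantee a hit.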

Choosing Keys

Partition on the column that queries filter on most often, usually a date or timestamp. Avoid creating many tiny partitions, since each partition carries its own overhead. Cluster on other columns, not the partition key, that appear frequently in filters or joins.
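One way to sanity-check a candidate partition key before committing to it is to count how many partitions it would create and how small they would be. The helper below is a minimal sketch under that assumption; `partition_stats`, the `day` and `event_id` columns, and the sample rows are illustrative only, and what counts as "too small" depends on the engine.

```python
from collections import Counter


def partition_stats(rows, key):
    """Summarize the partitions a candidate key would produce."""
    counts = Counter(row[key] for row in rows)
    return {
        "partitions": len(counts),            # number of distinct partitions
        "avg_rows": len(rows) / len(counts),  # average rows per partition
        "smallest": min(counts.values()),     # size of the smallest partition
    }


# Hypothetical sample: four events spread over two days.
rows = [
    {"day": "2024-01-01", "event_id": 1},
    {"day": "2024-01-01", "event_id": 2},
    {"day": "2024-01-02", "event_id": 3},
    {"day": "2024-01-02", "event_id": 4},
]

by_day = partition_stats(rows, "day")
by_event = partition_stats(rows, "event_id")
print(by_day)    # {'partitions': 2, 'avg_rows': 2.0, 'smallest': 2}
print(by_event)  # {'partitions': 4, 'avg_rows': 1.0, 'smallest': 1}
```

A near-unique column such as `event_id` yields one partition per row, which is the tiny-partition case to avoid; a column like that is usually a better fit as a cluster key.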

Key idea

Partition keys let queries skip whole chunks of a table while cluster keys let them skip blocks within a chunk, so choosing keys to match query filters minimizes bytes scanned.
