Partitions and clustering
In Cassandra, a partition is identified by its partition key, and the rows within it are ordered by the clustering columns. A partition that holds many such rows is called a wide row or wide partition.
Why wide rows help
Storing many related rows in one partition co-locates them on the same replica nodes.
- A single query can fetch a whole time-ordered slice with no scatter-gather across nodes.
- Clustering columns define the on-disk sort order, which enables range scans.
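The slice behavior described above can be illustrated with a toy in-memory model. This is only a sketch of the idea, not how Cassandra stores data: the `Partition` class, the `table` dictionary, and the sensor names are all hypothetical, and integer timestamps stand in for a clustering column.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class Partition:
    """Toy partition: rows kept sorted by a single clustering key."""

    def __init__(self):
        self.keys = []    # clustering keys, always sorted
        self.values = []  # values aligned index-for-index with self.keys

    def insert(self, clustering_key, value):
        i = bisect_left(self.keys, clustering_key)
        if i < len(self.keys) and self.keys[i] == clustering_key:
            self.values[i] = value  # same key again: overwrite (upsert)
        else:
            self.keys.insert(i, clustering_key)
            self.values.insert(i, value)

    def slice(self, start, end):
        """Rows with start <= clustering_key <= end, already in order."""
        lo = bisect_left(self.keys, start)
        hi = bisect_right(self.keys, end)
        return list(zip(self.keys[lo:hi], self.values[lo:hi]))


# Partition key -> partition. Each key maps to exactly one partition,
# so a slice query touches one place only.
table = defaultdict(Partition)
for ts, temp in [(30, 21.5), (10, 20.0), (20, 20.7), (40, 22.1)]:
    table["sensor-1"].insert(ts, temp)
table["sensor-2"].insert(15, 18.3)

print(table["sensor-1"].slice(15, 35))  # [(20, 20.7), (30, 21.5)]
```

Because the rows are already sorted, the range scan is two binary searches and one contiguous read, which is the property the clustering order provides.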
Sizing limits
Wide partitions are powerful but must be bounded.
- As a practical guideline, aim to keep each partition under roughly 100 MB and well under 100,000 rows.
- An unbounded partition becomes a hot, oversized partition that slows reads and compaction.
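A back-of-the-envelope check against these guidelines might look like the sketch below. The write rate and the 50 bytes per row are made-up assumptions for illustration; real on-disk size depends on the schema, compression, and per-row overhead, so treat the result as a rough estimate only.

```python
MAX_ROWS = 100_000  # row guideline from the notes above
MAX_MB = 100        # size guideline from the notes above


def estimate_partition(rows_per_second, seconds, bytes_per_row):
    """Return (row count, approximate size in MB) for one partition."""
    rows = rows_per_second * seconds
    size_mb = rows * bytes_per_row / 1_000_000
    return rows, size_mb


def within_guideline(rows, size_mb):
    """True if the partition stays under both guideline limits."""
    return rows < MAX_ROWS and size_mb < MAX_MB


SECONDS_PER_DAY = 86_400

# Assumed workload: one reading per second, about 50 bytes per row.
day_rows, day_mb = estimate_partition(1, SECONDS_PER_DAY, 50)
year_rows, year_mb = estimate_partition(1, SECONDS_PER_DAY * 365, 50)

print(day_rows, within_guideline(day_rows, day_mb))     # 86400 True
print(year_rows, within_guideline(year_rows, year_mb))  # 31536000 False
```

Under these assumptions a partition covering one day fits the guideline, while one covering a year exceeds both limits by a wide margin.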
Bucketing
To cap partition size, bucket the partition key: for example, add a day or month component to the key so that each bucket stays small.
- A sensor reading table might use sensor id plus day as the partition key.
- Queries then target one bucket at a time.
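The sensor example can be sketched on the client side as follows. The function names are hypothetical, and the choice of UTC days as the bucket boundary is an assumption; in CQL terms this would roughly correspond to a composite partition key such as `((sensor_id, day))` with the reading timestamp as a clustering column.

```python
from datetime import datetime, timedelta, timezone


def bucket_key(sensor_id, ts):
    """Partition key for one reading: (sensor id, UTC day).

    `ts` is expected to be a timezone-aware datetime.
    """
    return (sensor_id, ts.astimezone(timezone.utc).date())


def buckets_for_range(sensor_id, start, end):
    """Partition keys a query over [start, end] has to visit, in order."""
    day = start.astimezone(timezone.utc).date()
    last = end.astimezone(timezone.utc).date()
    keys = []
    while day <= last:
        keys.append((sensor_id, day))
        day += timedelta(days=1)
    return keys


start = datetime(2024, 1, 30, 22, 0, tzinfo=timezone.utc)
end = datetime(2024, 2, 1, 3, 0, tzinfo=timezone.utc)

# One query per bucket: three days means three partitions to read.
for key in buckets_for_range("sensor-1", start, end):
    print(key)
```

A time range that spans several buckets costs one query per bucket, so the bucket width is a trade-off between partition size and the number of queries a typical read needs.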
Key idea
Wide rows co-locate many clustered rows for fast slice queries, but you must bucket the partition key to keep partitions within healthy size limits.