Partitions are the parallelism unit
In a partitioned log the partition is the smallest unit of parallel consumption. A topic with eight partitions can be read by at most eight active consumers in a group working in parallel. Adding consumers beyond the partition count leaves some idle.
Choosing a partition count
- Too few caps your maximum parallelism and throughput.
- Too many raises overhead in metadata, open files, and rebalancing time.
- Pick enough headroom for future scaling, since reducing partitions later is disruptive.
The hot key problem
Partitioning by key spreads load only if keys are evenly distributed. A skewed key, such as one giant tenant, sends most traffic to a single partition. That partition becomes a hot spot while others sit idle, and its consumer falls behind.
Mitigations
- Salt the key: append a small random suffix to a hot key to fan it across several partitions, then merge results downstream.
- Composite keys: combine the natural key with another dimension to spread load while keeping needed ordering scope.
- Monitor lag per partition to detect skew early.
Key idea
Partitions cap how many consumers can work in parallel, so choose enough of them and pick a key that spreads load evenly, salting or compounding hot keys to avoid a single overloaded partition.