← Lessons

quiz vs the machine

Gold1400

System Design

Partitioning for Parallelism

Use partitions as the unit of parallel work and avoid hot keys that skew the load.

5 min read · core · beat Gold to climb

Partitions are the parallelism unit

In a partitioned log the partition is the smallest unit of parallel consumption. A topic with eight partitions can be read by at most eight active consumers in a group working in parallel. Adding consumers beyond the partition count leaves some idle.

Choosing a partition count

  • Too few caps your maximum parallelism and throughput.
  • Too many raises overhead in metadata, open files, and rebalancing time.
  • Pick enough headroom for future scaling, since reducing partitions later is disruptive.

The hot key problem

Partitioning by key spreads load only if keys are evenly distributed. A skewed key, such as one giant tenant, sends most traffic to a single partition. That partition becomes a hot spot while others sit idle, and its consumer falls behind.

Mitigations

  • Salt the key: append a small random suffix to a hot key to fan it across several partitions, then merge results downstream.
  • Composite keys: combine the natural key with another dimension to spread load while keeping needed ordering scope.
  • Monitor lag per partition to detect skew early.

Key idea

Partitions cap how many consumers can work in parallel, so choose enough of them and pick a key that spreads load evenly, salting or compounding hot keys to avoid a single overloaded partition.

Check yourself

Answer to earn rating on the learn ladder.

1. What limits the number of active parallel consumers in a group?

2. What causes a hot partition?

3. How can salting a hot key help?