The id problem at scale
Many systems need a unique id for every new record. A single database counter is simple but becomes a bottleneck and a single point of failure. The snowflake approach lets many machines mint unique ids independently.
How a snowflake id is built
A snowflake id is a 64-bit number split into three parts:
- A timestamp in the high bits, milliseconds since a chosen start time.
- A machine id identifying the worker that generated it.
- A sequence number that increments for ids made in the same millisecond on that machine.
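Together the three parts make each id unique. The machine id separates ids from different machines, since no two machines may share one. The timestamp separates ids minted in different milliseconds, and the sequence separates ids minted in the same millisecond on the same machine.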
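The packing can be sketched with shifts and bitwise OR. The text does not say how many bits each part gets, so the widths below (41 for the timestamp, 10 for the machine id, 12 for the sequence, top bit unused) are an assumption borrowed from a commonly cited layout, and `compose_id` is a hypothetical helper name.

```python
# Assumed layout (not specified in the text): 41-bit timestamp,
# 10-bit machine id, 12-bit sequence, top bit left unused.
TIMESTAMP_BITS = 41
MACHINE_BITS = 10
SEQUENCE_BITS = 12

MAX_MACHINE_ID = (1 << MACHINE_BITS) - 1   # 1023
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1    # 4095

MACHINE_SHIFT = SEQUENCE_BITS                   # machine id sits above the sequence
TIMESTAMP_SHIFT = SEQUENCE_BITS + MACHINE_BITS  # timestamp sits above both


def compose_id(elapsed_ms: int, machine_id: int, sequence: int) -> int:
    """Pack the three fields into one integer, timestamp in the high bits."""
    if not 0 <= elapsed_ms < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp out of range")
    if not 0 <= machine_id <= MAX_MACHINE_ID:
        raise ValueError("machine id out of range")
    if not 0 <= sequence <= MAX_SEQUENCE:
        raise ValueError("sequence out of range")
    return (elapsed_ms << TIMESTAMP_SHIFT) | (machine_id << MACHINE_SHIFT) | sequence
```

Under these assumed widths there is room for 1024 machines and 4096 ids per machine per millisecond, and 41 bits of milliseconds last roughly 69 years from the chosen start time. Other splits trade these limits against each other.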
Why teams like it
- Ids are generated locally, with no central coordination on the hot path.
- The timestamp prefix makes ids roughly time-sortable, which is useful for ordering and indexing.
- It scales horizontally by handing each worker a distinct machine id.
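The rough time ordering follows from the layout: because the timestamp occupies the high bits, comparing ids as plain integers compares timestamps first. The sketch below illustrates this, again assuming a 41/10/12 bit split that the text does not specify; `decompose_id` is a hypothetical helper.

```python
# Assumed layout: 12-bit sequence, then 10-bit machine id, then timestamp.
TIMESTAMP_SHIFT = 22
MACHINE_SHIFT = 12
MACHINE_MASK = (1 << 10) - 1
SEQUENCE_MASK = (1 << 12) - 1


def decompose_id(snowflake: int) -> tuple:
    """Split an id back into (elapsed_ms, machine_id, sequence)."""
    elapsed_ms = snowflake >> TIMESTAMP_SHIFT
    machine_id = (snowflake >> MACHINE_SHIFT) & MACHINE_MASK
    sequence = snowflake & SEQUENCE_MASK
    return elapsed_ms, machine_id, sequence


# Three ids minted on different workers, listed out of time order.
ids = [
    (2000 << TIMESTAMP_SHIFT) | (7 << MACHINE_SHIFT) | 0,
    (1000 << TIMESTAMP_SHIFT) | (900 << MACHINE_SHIFT) | 5,
    (1000 << TIMESTAMP_SHIFT) | (3 << MACHINE_SHIFT) | 9,
]

# A plain integer sort orders them by timestamp.
ordered = sorted(ids)
timestamps = [decompose_id(i)[0] for i in ordered]
```

The ordering is only rough: the two ids sharing millisecond 1000 are ordered by machine id, not by which was actually minted first, and clocks on different machines are never perfectly in sync.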
The catches
- A machine's clock must not run backward. If it does, new ids could collide with earlier ones or sort before them.
- The sequence has a fixed number of bits, which caps how many ids one machine can mint per millisecond.
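One way a generator might handle both catches is sketched below. This is a minimal illustration under stated assumptions, not a reference implementation: the 41/10/12 bit split, the `EPOCH_MS` start time, the `SnowflakeGenerator` class, and its injectable `clock` parameter are all hypothetical. It refuses to mint ids if the clock moves backward, and waits for the next millisecond when the sequence is exhausted.

```python
import threading
import time

EPOCH_MS = 1_577_836_800_000  # hypothetical start time: 2020-01-01T00:00:00Z
MACHINE_SHIFT = 12            # assumed 12-bit sequence
TIMESTAMP_SHIFT = 22          # assumed 10-bit machine id above it
MAX_MACHINE_ID = (1 << 10) - 1
MAX_SEQUENCE = (1 << 12) - 1


class SnowflakeGenerator:
    def __init__(self, machine_id, clock=None):
        if not 0 <= machine_id <= MAX_MACHINE_ID:
            raise ValueError("machine id out of range")
        self._machine_id = machine_id
        # The clock returns wall-clock milliseconds; injectable for testing.
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Catch 1: minting now could repeat or unsort earlier ids.
                raise RuntimeError("clock moved backward; refusing to mint ids")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Catch 2: sequence exhausted, spin until the next millisecond.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << TIMESTAMP_SHIFT)
                | (self._machine_id << MACHINE_SHIFT)
                | self._sequence
            )
```

A worker would create one generator, for example `SnowflakeGenerator(machine_id=7)`, and call `next_id()` for each new record. Raising on a backward clock is only one policy; waiting until the clock catches up is another, and which is better depends on how long the regression lasts.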
Key idea
A snowflake id packs a timestamp, a machine id, and a per-millisecond sequence into one 64-bit number, letting many machines mint unique, roughly time-sortable ids without a central counter.