The Distributed Cache Design

Why a single cache is not enough

A cache stores hot data in memory so reads skip the slow database. One cache box runs out of memory and becomes a single point of failure. A distributed cache spreads keys across many nodes for capacity and resilience.

Placing keys on nodes

The challenge is deciding which node holds which key. Plain modulo hashing reshuffles everything when a node is added or removed.

Consistent hashing maps keys and nodes onto a ring. A key belongs to the next node clockwise. Adding or removing a node only moves the keys near it, not the whole keyspace, keeping the cache mostly warm.

Eviction

Cache memory is finite, so it must evict entries:

Least recently used drops the entry untouched the longest.
Time to live expires entries after a set age.
These keep the cache holding the most useful data.

Consistency concerns

A cached value can go stale when the source changes, so writes invalidate or update the cache.
A cache miss falls through to the database, which then populates the cache.

Key idea