Memory Hierarchy Latency

Why each step away from the core is dramatically slower than the last.

Layers of storage

A modern machine stores data at several levels, each larger but slower than the one before it:

  • Registers sit inside the core and are the fastest level.
  • L1 cache is small and takes a few cycles.
  • L2 and L3 caches are larger and take tens of cycles.
  • Main memory takes hundreds of cycles.
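
To make the ordering concrete, the sketch below attaches illustrative cycle counts to each level and prints how each compares to an L1 hit. The specific numbers and the helper name `slowdown_vs_l1` are assumptions chosen to fit the rough ranges in the list above, not measurements of any particular processor.

```python
# Illustrative latencies in CPU cycles. These values are assumptions picked
# to fit the rough ranges in the list above; real numbers vary by processor.
LATENCY_CYCLES = {
    "register": 1,
    "L1 cache": 4,
    "L2 cache": 12,
    "L3 cache": 40,
    "main memory": 400,
}

def slowdown_vs_l1(level: str) -> float:
    """Return how many times slower `level` is than an L1 hit."""
    return LATENCY_CYCLES[level] / LATENCY_CYCLES["L1 cache"]

for level in LATENCY_CYCLES:
    cycles = LATENCY_CYCLES[level]
    print(f"{level:12s} {cycles:4d} cycles  {slowdown_vs_l1(level):6.2f}x L1")
```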

Orders of magnitude

The jump between levels is large. An L1 hit may cost a few cycles, while a trip to main memory can cost two orders of magnitude more. Because of this gap, a program that frequently misses the cache is often bound by memory latency rather than by arithmetic.
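
One common way to reason about this gap is average memory access time: hit time plus miss rate times miss penalty. The sketch below applies that formula to a single cache level in front of main memory. The 4-cycle hit, the 400-cycle penalty, and the function name `average_access_cycles` are illustrative assumptions, not figures from a real machine.

```python
def average_access_cycles(hit_cycles: float, miss_rate: float,
                          miss_penalty_cycles: float) -> float:
    """Average memory access time: hit time + miss rate * miss penalty."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_cycles + miss_rate * miss_penalty_cycles

# Assumed values for illustration only: a 4-cycle L1 hit and a
# 400-cycle penalty for going to main memory on a miss.
L1_HIT = 4
MEMORY_PENALTY = 400

for miss_rate in (0.0, 0.01, 0.05, 0.20):
    avg = average_access_cycles(L1_HIT, miss_rate, MEMORY_PENALTY)
    print(f"miss rate {miss_rate:5.0%}: about {avg:.0f} cycles per access")
```

Under these assumptions, a miss rate of only 1% roughly doubles the average cost of an access, which is why a program with frequent misses spends most of its time waiting on memory.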

Why locality wins

Caches hold recently used data and the data near it. Code that reuses values and walks memory in order keeps its work in the fast levels. This is why reuse and sequential access, known as temporal and spatial locality respectively, matter so much for concurrent and serial code alike.
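
A toy model can show why walking memory in order helps. The sketch below simulates a small direct-mapped cache and counts misses for a row-by-row walk versus a column-by-column walk over the same array. The 64-byte line size, the 64-line capacity, the array shape, and the function name `count_misses` are all assumptions for illustration. Real caches are set-associative and use prefetchers, so treat the counts as qualitative.

```python
LINE_BYTES = 64    # assumed cache line size
NUM_LINES = 64     # assumed capacity: 64 lines = 4 KiB, direct-mapped

def count_misses(addresses):
    """Count misses for a sequence of byte addresses in a direct-mapped cache."""
    slots = [None] * NUM_LINES   # each slot remembers which line it holds
    misses = 0
    for addr in addresses:
        line = addr // LINE_BYTES    # which memory line the byte lives in
        slot = line % NUM_LINES      # the one slot that line may occupy
        if slots[slot] != line:
            misses += 1
            slots[slot] = line       # evict the old line, load the new one
    return misses

ELEM_BYTES = 8
ROWS, COLS = 256, 256   # a 256x256 array of 8-byte elements, stored row by row

# Row-major walk: consecutive addresses, so each loaded line serves 8 elements.
row_major = [(r * COLS + c) * ELEM_BYTES
             for r in range(ROWS) for c in range(COLS)]
# Column-major walk: jumps a full row (2048 bytes) between accesses.
col_major = [(r * COLS + c) * ELEM_BYTES
             for c in range(COLS) for r in range(ROWS)]

print("row-major misses:   ", count_misses(row_major))
print("column-major misses:", count_misses(col_major))
```

In this model the row-major walk misses once per cache line (8192 misses for 65536 accesses), while the column-major walk misses on every access, because each line is evicted before its neighboring elements are used.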

Key idea

Each step down the memory hierarchy is much slower, so keeping hot data in cache through locality often matters more than raw computation.

Check yourself

1. Roughly how does main memory latency compare to an L1 hit?

2. Why does spatial locality help performance?