← Lessons

quiz vs the machine

Gold1400

System Design

Cache Key Normalization

Why the cache key shape decides whether equivalent requests share a hit.

5 min read · core · beat Gold to climb

What a cache key is

A cache key is the identity the CDN uses to look up a cached object. By default it may include the full URL, query string, some headers, and cookies. If two requests for the same content produce different keys, they cache separately and both miss.

How keys explode

  • Tracking query params like a campaign tag create a new key per visitor.
  • Volatile headers or cookies vary the key needlessly.
  • Mixed case or ordering in query strings splits identical requests.

Normalizing the key

  • Strip or whitelist query params so only meaningful ones stay.
  • Sort params to a canonical order.
  • Drop cookies and headers that do not change the response body.

The vary tradeoff

Sometimes content truly depends on a header, like language or device class. Use a controlled vary on just that dimension so you keep correct variants without exploding the key space. The goal is one key per distinct response, no more.

Key idea

Normalize cache keys so equivalent requests collapse to one entry, varying only on dimensions that truly change the response.

Check yourself

Answer to earn rating on the learn ladder.

1. Why do tracking query params hurt hit ratio?

2. When should you vary the cache key on a header?