Why hit ratio matters
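The cache hit ratio is the share of requests served from cache rather than the origin. A higher ratio means lower latency, less origin load, and lower egress cost. Only misses reach the origin, so origin traffic scales with the miss ratio (1 minus the hit ratio). When the hit ratio is already high, even a few points of improvement can sharply cut origin traffic.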
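The arithmetic behind that claim can be sketched in a few lines. This is a minimal illustration, assuming a single cache tier and counting requests rather than bytes; the function names are hypothetical and not taken from any particular CDN.

```python
def origin_share(hit_ratio: float) -> float:
    """Fraction of requests that reach the origin (the miss ratio)."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return 1.0 - hit_ratio


def origin_reduction(before: float, after: float) -> float:
    """Relative drop in origin requests when hit ratio rises from before to after."""
    old = origin_share(before)
    new = origin_share(after)
    if old == 0.0:
        raise ValueError("no origin traffic to reduce at a 100% hit ratio")
    return (old - new) / old


# Going from a 90% to a 95% hit ratio halves the requests that reach the origin.
print(round(origin_reduction(0.90, 0.95), 2))  # 0.5
```

The same five points gained at a lower starting ratio help far less in relative terms, which is why the starting ratio matters when estimating savings.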
What lowers it
- Short TTLs force frequent revalidation and refills.
- Cache fragmentation from many key variants splits one object into many cold entries.
- Long-tail content that few users request never warms up.
Levers to raise it
- Lengthen TTLs where content changes rarely, and revalidate stale entries with conditional requests (If-None-Match or If-Modified-Since) instead of refetching the full body.
- Normalize cache keys so equivalent requests share one entry.
- Tier caches so that a regional cache absorbs edge misses and the origin sees only requests that miss at every tier.
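Key normalization is the most mechanical of these levers, so here is a rough sketch of it. It assumes that query parameter order does not affect the response and that the parameters in IGNORED_PARAMS (a hypothetical list of tracking parameters) never change the content; both assumptions need checking against the real origin before use.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical set of parameters assumed not to affect the response body.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "fbclid", "gclid"}


def normalize_cache_key(url: str) -> str:
    """Map equivalent URLs to one cache key.

    Lowercases the host, drops ignored parameters and the fragment,
    and sorts the remaining query parameters.
    """
    parts = urlsplit(url)
    query = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in IGNORED_PARAMS
    )
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path or "/", urlencode(query), "")
    )


a = normalize_cache_key("https://Example.com/img/logo.png?w=200&utm_source=mail&h=100")
b = normalize_cache_key("https://example.com/img/logo.png?h=100&w=200")
print(a == b)  # True: both requests now share one cache entry
```

Without normalization these two requests would occupy separate entries, each warming up on its own.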
Measuring it
Track hit ratio per PoP and per object class. A global average can hide a hot path that misses constantly. Watch byte hit ratio (bytes served from cache divided by total bytes served) too, since large objects dominate egress even if request counts look healthy.
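One way to compute both ratios per PoP and object class is sketched below. The record layout (pop, object_class, is_hit, size_bytes) and the sample values are assumptions made for illustration; real access logs will need their own parsing step.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Tally:
    requests: int = 0
    hits: int = 0
    bytes_total: int = 0
    bytes_hit: int = 0

    @property
    def request_hit_ratio(self) -> float:
        return self.hits / self.requests if self.requests else 0.0

    @property
    def byte_hit_ratio(self) -> float:
        return self.bytes_hit / self.bytes_total if self.bytes_total else 0.0


def hit_ratios(records):
    """Aggregate (pop, object_class, is_hit, size_bytes) records per (pop, class)."""
    tallies = defaultdict(Tally)
    for pop, object_class, is_hit, size_bytes in records:
        tally = tallies[(pop, object_class)]
        tally.requests += 1
        tally.bytes_total += size_bytes
        if is_hit:
            tally.hits += 1
            tally.bytes_hit += size_bytes
    return dict(tallies)


# Made-up sample: nine small hits and one large miss at the same PoP and class.
sample = [("fra", "video", True, 10_000)] * 9 + [("fra", "video", False, 910_000)]
stats = hit_ratios(sample)[("fra", "video")]
print(stats.request_hit_ratio, stats.byte_hit_ratio)  # 0.9 0.09
```

In this sample the request hit ratio looks healthy at 90%, yet only 9% of the bytes came from cache, which is the gap that byte hit ratio is meant to expose.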
Key idea
Raise hit ratio by lengthening TTLs, normalizing keys, and tiering caches, and measure both request and byte hit ratios to find weak paths.