System Design

Latency Percentiles: p50 and p99

Why an average hides pain and percentiles tell the real story of user experience.

4 min read · intro

Why averages lie

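The mean latency of a service can look healthy while a meaningful share of users suffer. The fast majority dilutes a small share of very slow requests: 99 requests at 20 ms plus one at 800 ms average just 27.8 ms. That is why teams report percentiles instead.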

Reading a percentile

A percentile is the latency at or below which a given share of requests complete.

  • p50 is the median: half of requests are faster.
  • p90 means nine in ten requests are faster.
  • p99 marks the slow tail: only one request in a hundred is slower.

If p50 is 20 ms but p99 is 800 ms, most calls feel snappy, yet one in a hundred takes 800 ms or longer.
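
To make the definition concrete, here is a minimal sketch that computes percentiles with the nearest-rank method, one of several common conventions (others interpolate between samples). The `percentile` helper and the sample data are hypothetical and chosen only to reproduce the 20 ms and 800 ms figures above.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]

# Hypothetical data: 98 fast requests at 20 ms and 2 slow ones at 800 ms.
latencies_ms = [20] * 98 + [800] * 2

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)

print(f"mean={mean_ms:.1f} ms  p50={p50_ms} ms  p99={p99_ms} ms")
```

With this data the mean is 35.6 ms, which looks acceptable, while p99 reports the 800 ms that the slow requests actually took.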

Why the tail matters

A single page often fans out to many backend calls. When the page must wait for all of them, the slowest call sets the page time, so a user is exposed to the tail far more often than the median suggests.
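
The fan-out effect can be estimated with a little arithmetic. The sketch below assumes the backend calls are independent and that the page waits for every one of them; both are simplifications, and the function name `chance_of_slow_call` is made up for illustration.

```python
def chance_of_slow_call(fan_out, tail_fraction=0.01):
    """Probability that at least one of `fan_out` independent calls
    lands in the slowest `tail_fraction` of requests (0.01 = beyond p99)."""
    return 1 - (1 - tail_fraction) ** fan_out

for n in (1, 10, 100):
    share = chance_of_slow_call(n)
    print(f"{n:>3} calls per page: {share:.1%} of pages include a tail-latency call")
```

Under these assumptions, a page that makes 100 calls includes at least one p99-or-slower call roughly 63% of the time, so the "one in a hundred" tail becomes the common case.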

Measuring well

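Percentiles cannot be averaged across machines: the mean of per-machine p99 values is generally not the fleet's p99. Compute them from pooled raw samples, or merge histograms and read percentiles from the result. Track p50, p99, and p999 (the 99.9th percentile) together to see both the typical case and the far tail.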
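
A small sketch shows why averaging goes wrong and why histograms merge cleanly. The two machines, their traffic, and both helper functions are hypothetical. For simplicity each histogram bucket here is an exact latency value; real histograms use bucket ranges, so the result is only as precise as the bucket bounds.

```python
import math
from collections import Counter

def percentile(samples, p):
    """Nearest-rank percentile over raw samples."""
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]

def histogram_percentile(hist, p):
    """Percentile from a histogram mapping bucket upper bound (ms) to count."""
    total = sum(hist.values())
    target = math.ceil(p * total / 100)
    seen = 0
    for upper_bound in sorted(hist):
        seen += hist[upper_bound]
        if seen >= target:
            return upper_bound
    raise ValueError("empty histogram")

# Hypothetical fleet: a quiet healthy machine and a busy struggling one.
machine_a = [20] * 100               # 100 requests, all fast
machine_b = [20] * 855 + [800] * 45  # 900 requests, 5% slow

# Wrong: average the per-machine p99 values.
averaged_p99 = (percentile(machine_a, 99) + percentile(machine_b, 99)) / 2

# Right: pool the raw samples, then take the percentile.
pooled_p99 = percentile(machine_a + machine_b, 99)

# Also right: histograms merge by adding bucket counts.
merged = Counter(machine_a) + Counter(machine_b)
merged_p99 = histogram_percentile(merged, 99)

print(averaged_p99, pooled_p99, merged_p99)
```

Here the averaged figure comes out at 410 ms, while both the pooled samples and the merged histogram give 800 ms. The average ignores that the struggling machine serves nine times as much traffic.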

Key idea

Watch percentiles, not averages, because the tail at p99 is what frustrated users actually feel.

Check yourself

1. What does a p99 latency of 800 ms mean?

2. Why can the average latency look healthy while users suffer?