Why percentiles, not averages
A mean hides the tail. The ninety-ninth percentile, the value below which ninety-nine percent of observations fall, captures the slow requests users actually feel. Metrics systems rarely keep every raw observation, so they estimate percentiles from histograms or sketches.
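
To make the gap between the mean and the tail concrete, here is a minimal sketch using made-up latencies. The helper name nearest_rank_percentile and the sample data are assumptions for illustration only. It uses the simple nearest-rank definition on raw values, not the histogram estimate described later.

```python
def nearest_rank_percentile(values, percent):
    """Nearest-rank percentile: the smallest value with at least
    `percent` percent of the observations at or below it."""
    if not values:
        raise ValueError("no observations")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    ordered = sorted(values)
    # Integer ceiling of percent * n / 100, giving a 1-based rank.
    rank = (percent * len(ordered) + 99) // 100
    return ordered[rank - 1]

# Hypothetical latencies in milliseconds: 98 fast requests and 2 slow ones.
latencies_ms = [10] * 98 + [2000] * 2

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50_ms = nearest_rank_percentile(latencies_ms, 50)
p99_ms = nearest_rank_percentile(latencies_ms, 99)

print(f"mean={mean_ms} ms, p50={p50_ms} ms, p99={p99_ms} ms")
```

With this data the mean lands between the two groups and describes neither of them, while the ninety-ninth percentile reports the slow requests directly.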
Estimating from a histogram
Given cumulative bucket counts:
- Compute the target rank: the total count times the quantile, such as the total count times zero point nine nine for the ninety-ninth percentile.
- Locate the first bucket whose cumulative count reaches or exceeds that rank.
- Interpolate linearly within that bucket between its lower and upper boundary.
The error is bounded by the width of the bucket that contains the rank, so a wide bucket at the tail gives an imprecise estimate.
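
The steps above can be sketched as a short function. This is an illustrative sketch, not the implementation of any particular metrics system. The function name, the parameter layout, and the assumption that the first bucket starts at zero are all choices made for this example.

```python
def estimate_percentile(upper_bounds, cumulative_counts, q):
    """Estimate the q-quantile (0 < q <= 1) from cumulative buckets.

    upper_bounds[i] is the upper boundary of bucket i, and
    cumulative_counts[i] is how many observations are <= that boundary.
    Assumption for this sketch: the first bucket starts at 0.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = cumulative_counts[-1]
    if total == 0:
        raise ValueError("histogram is empty")

    rank = q * total            # step 1: target rank
    lower_bound = 0.0
    count_below = 0
    for upper_bound, cumulative in zip(upper_bounds, cumulative_counts):
        if cumulative >= rank:  # step 2: bucket that crosses the rank
            in_bucket = cumulative - count_below
            fraction = (rank - count_below) / in_bucket
            # step 3: linear interpolation inside the bucket
            return lower_bound + fraction * (upper_bound - lower_bound)
        lower_bound = upper_bound
        count_below = cumulative
    return float(upper_bounds[-1])

# Hypothetical latency histogram with boundaries in milliseconds.
bounds = [100, 250, 500, 1000]
cumulative = [900, 970, 995, 1000]
p99 = estimate_percentile(bounds, cumulative, 0.99)
print(f"estimated p99 = {p99:.1f} ms")
```

Here the rank of 990 falls in the bucket from 250 to 500, so the estimate can land anywhere in that 250 millisecond range depending on where the real observations sit. Narrower buckets at the tail shrink that uncertainty.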
Why you cannot average percentiles
Percentiles do not combine linearly. Averaging the ninety-ninth percentiles of two instances does not give the ninety-ninth percentile of their combined traffic. You must aggregate the underlying buckets first, then compute the percentile once.
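
A small worked example shows how far apart the two answers can be. The two instances, their bucket boundaries, and their counts are invented for illustration. The helper is a compact version of linear interpolation over cumulative buckets, and it assumes the first bucket starts at zero.

```python
def estimate_percentile(upper_bounds, cumulative_counts, q):
    """Linear-interpolation quantile estimate over cumulative buckets.
    Assumes the first bucket starts at 0 and the histogram is not empty."""
    rank = q * cumulative_counts[-1]
    lower_bound, count_below = 0.0, 0
    for upper_bound, cumulative in zip(upper_bounds, cumulative_counts):
        if cumulative >= rank:
            fraction = (rank - count_below) / (cumulative - count_below)
            return lower_bound + fraction * (upper_bound - lower_bound)
        lower_bound, count_below = upper_bound, cumulative
    return float(upper_bounds[-1])

bounds = [100, 250, 500, 1000]        # shared boundaries, in milliseconds

# Hypothetical instance A: 1000 requests, all fast.
instance_a = [1000, 1000, 1000, 1000]
# Hypothetical instance B: 100 requests, all slow.
instance_b = [0, 0, 50, 100]

p99_a = estimate_percentile(bounds, instance_a, 0.99)
p99_b = estimate_percentile(bounds, instance_b, 0.99)
averaged = (p99_a + p99_b) / 2        # the wrong way

# The right way: sum the cumulative counts bucket by bucket, then estimate once.
merged = [a + b for a, b in zip(instance_a, instance_b)]
p99_merged = estimate_percentile(bounds, merged, 0.99)

print(f"average of per-instance p99s: {averaged:.1f} ms")
print(f"p99 of merged buckets:        {p99_merged:.1f} ms")
```

In this example the averaged figure gives the small, slow instance the same weight as the large, fast one, and it does not correspond to any rank in the combined data. Summing buckets works only because both instances share the same boundaries.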
Sketches as an alternative
Data structures like t-digest or DDSketch store compact summaries that can be merged across instances and give accurate quantiles without predefined bucket boundaries.
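
To illustrate what "mergeable" means, here is a toy sketch that uses logarithmically sized buckets, loosely in the spirit of DDSketch. It is a simplified illustration, not the API or the full algorithm of any real library: the class name LogBucketSketch and its methods are hypothetical, it accepts only positive values, and it never limits how many buckets it keeps.

```python
import math
from collections import Counter

class LogBucketSketch:
    """Toy mergeable quantile sketch (illustrative only, positive values only)."""

    def __init__(self, relative_accuracy=0.01):
        # Bucket i covers (gamma**(i-1), gamma**i], so every value in a
        # bucket is within `relative_accuracy` of the bucket's estimate.
        self.gamma = (1 + relative_accuracy) / (1 - relative_accuracy)
        self.log_gamma = math.log(self.gamma)
        self.counts = Counter()
        self.total = 0

    def add(self, value):
        if value <= 0:
            raise ValueError("this toy sketch accepts positive values only")
        index = math.ceil(math.log(value) / self.log_gamma)
        self.counts[index] += 1
        self.total += 1

    def merge(self, other):
        if other.gamma != self.gamma:
            raise ValueError("sketches must share the same accuracy")
        self.counts.update(other.counts)  # Counter.update adds the counts
        self.total += other.total

    def quantile(self, q):
        if self.total == 0:
            raise ValueError("sketch is empty")
        if not 0 <= q <= 1:
            raise ValueError("q must be in [0, 1]")
        rank = q * (self.total - 1)
        seen = 0
        for index in sorted(self.counts):
            seen += self.counts[index]
            if seen > rank:
                # Representative value for bucket `index`.
                return 2 * self.gamma ** index / (self.gamma + 1)

# Two hypothetical instances each see half of the values 1..1000.
sketch_a, sketch_b = LogBucketSketch(), LogBucketSketch()
for value in range(1, 1001):
    (sketch_a if value % 2 else sketch_b).add(value)

sketch_a.merge(sketch_b)
print(f"merged p99 estimate: {sketch_a.quantile(0.99):.1f}")
```

Merging adds bucket counts, so the merged sketch holds the same counts as one sketch that had seen every value. The bucket layout follows from the accuracy setting rather than from hand-picked boundaries, which keeps the relative error roughly constant across the whole range, tail included.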
Key idea
Merge buckets or sketches first, then compute the percentile once. Never average per-instance percentiles, which is mathematically wrong.