Percentile Computation

Why percentiles, not averages

A mean hides the tail. The 99th percentile (p99), the value below which 99 percent of observations fall, captures the slow requests users actually feel. Metrics systems usually estimate percentiles from histograms or sketches rather than from the raw observations.
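A small worked example may make the gap concrete. The latencies below are invented for illustration, and nearest_rank_percentile is a hypothetical helper that uses the simple nearest-rank definition, which is only one of several percentile definitions in use.

```python
import math
from statistics import mean


def nearest_rank_percentile(values, p):
    """Return the p-th percentile (0 < p <= 100) by the nearest-rank method."""
    ordered = sorted(values)
    # Nearest rank: the smallest 1-based position covering p percent of the data.
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


# Invented data: 98 fast requests at 10 ms and 2 slow requests at 2000 ms.
latencies_ms = [10] * 98 + [2000] * 2

average_ms = mean(latencies_ms)                     # about 49.8 ms
p99_ms = nearest_rank_percentile(latencies_ms, 99)  # 2000 ms
```

The mean suggests a service that answers in about 50 ms, while the p99 shows that one request in a hundred takes two seconds.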

Estimating from a histogram

Given cumulative bucket counts:

Find the target rank: the total count times the quantile, for example 0.99 times the total for the 99th percentile.
Locate the bucket whose cumulative count crosses that rank.
Interpolate linearly within that bucket between its lower and upper boundary.

The estimate can be off by at most the width of the bucket that contains the rank, so a wide bucket at the tail gives an imprecise result.
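The three steps above can be sketched in Python as follows. This is a minimal illustration, not the implementation of any particular metrics system: the function name histogram_quantile, its parameters, and the assumption that the first bucket starts at 0 are all choices made for this example.

```python
from bisect import bisect_left


def histogram_quantile(q, upper_bounds, cumulative_counts, lowest_bound=0.0):
    """Estimate the q-quantile (0 < q <= 1) from cumulative bucket counts.

    upper_bounds[i] is the upper boundary of bucket i, in ascending order.
    cumulative_counts[i] is the number of observations <= upper_bounds[i].
    lowest_bound is the assumed lower boundary of the first bucket.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = cumulative_counts[-1]
    if total == 0:
        raise ValueError("histogram is empty")

    # Step 1: the target rank.
    rank = q * total

    # Step 2: the first bucket whose cumulative count reaches the rank.
    i = bisect_left(cumulative_counts, rank)

    # Step 3: linear interpolation between the bucket's boundaries.
    lower = lowest_bound if i == 0 else upper_bounds[i - 1]
    upper = upper_bounds[i]
    below = 0 if i == 0 else cumulative_counts[i - 1]
    in_bucket = cumulative_counts[i] - below
    return lower + (upper - lower) * (rank - below) / in_bucket


# Hypothetical latency histogram, boundaries in seconds.
bounds = [0.1, 0.5, 1.0, 5.0]
cumulative = [600, 900, 980, 1000]
p99 = histogram_quantile(0.99, bounds, cumulative)  # about 3.0
```

Here the rank is 990, which falls in the bucket from 1.0 to 5.0 seconds. That bucket holds 20 observations and the rank sits halfway through them, so the estimate is 3.0 seconds. The true value could be anywhere between 1.0 and 5.0, which is the imprecision a wide tail bucket causes.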

Why you cannot average percentiles

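Percentiles are not linear. Averaging the 99th percentiles of two instances does not give the 99th percentile of their combined traffic, because the average ignores how many requests each instance served and how those requests were distributed. You must aggregate the underlying buckets first, which requires every instance to use the same bucket boundaries, then compute the percentile once.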
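A short sketch with invented numbers shows how far the two approaches can diverge. The bucket boundaries, the per-instance counts, and the helper p99 are all hypothetical. The helper assumes the first bucket starts at 0 and interpolates linearly inside the bucket that contains the rank.

```python
from bisect import bisect_left
from itertools import accumulate

# Shared upper bucket boundaries in seconds (hypothetical).
BOUNDS = [0.1, 0.5, 1.0, 5.0]


def p99(bucket_counts, bounds=BOUNDS):
    """Estimate the 99th percentile from per-bucket (non-cumulative) counts."""
    cumulative = list(accumulate(bucket_counts))
    rank = 0.99 * cumulative[-1]
    i = bisect_left(cumulative, rank)
    lower = 0.0 if i == 0 else bounds[i - 1]
    below = 0 if i == 0 else cumulative[i - 1]
    return lower + (bounds[i] - lower) * (rank - below) / bucket_counts[i]


fast = [9900, 100, 0, 0]  # busy instance, almost everything under 0.1 s
slow = [0, 0, 50, 50]     # lightly loaded instance, everything over 0.5 s

# Wrong: average the two per-instance percentiles.
averaged = (p99(fast) + p99(slow)) / 2             # about 2.51 s

# Right: sum the buckets, then compute the percentile once.
merged_counts = [a + b for a, b in zip(fast, slow)]
merged = p99(merged_counts)                        # about 0.496 s
```

The averaged figure overstates the combined p99 roughly fivefold, because it gives the 100 requests on the slow instance the same weight as the 10,000 on the fast one.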

Sketches as an alternative

Data structures like t-digest or DDSketch store compact summaries that can be merged across instances and give accurate quantile estimates without preconfigured bucket boundaries.
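To give a feel for how such a sketch works, here is a heavily simplified illustration of the logarithmic-bucket idea behind DDSketch. It is not the DDSketch library or its API: the class LogBucketSketch and its methods are hypothetical, it handles positive values only, and it never collapses buckets to cap memory as a production sketch would.

```python
import math
from collections import Counter


class LogBucketSketch:
    """Toy quantile sketch with a relative-error target (positive values only)."""

    def __init__(self, relative_accuracy=0.01):
        # Bucket k covers (gamma**(k-1), gamma**k], so bucket width grows
        # in proportion to the values it holds.
        self.gamma = (1 + relative_accuracy) / (1 - relative_accuracy)
        self.log_gamma = math.log(self.gamma)
        self.counts = Counter()
        self.total = 0

    def add(self, value):
        if value <= 0:
            raise ValueError("this sketch handles positive values only")
        key = math.ceil(math.log(value) / self.log_gamma)
        self.counts[key] += 1
        self.total += 1

    def merge(self, other):
        # Merging is a per-bucket sum; both sketches must share the same gamma.
        self.counts.update(other.counts)
        self.total += other.total

    def quantile(self, q):
        if self.total == 0:
            raise ValueError("sketch is empty")
        rank = q * (self.total - 1)
        seen = 0
        for key in sorted(self.counts):
            seen += self.counts[key]
            if seen > rank:
                # Representative value of the bucket, chosen so the relative
                # error against any value in the bucket stays within the target.
                return 2 * self.gamma ** key / (self.gamma + 1)


# Two instances record different halves of the values 1..1000, then merge.
a, b = LogBucketSketch(), LogBucketSketch()
for v in range(1, 501):
    a.add(v)
for v in range(501, 1001):
    b.add(v)
a.merge(b)
estimate = a.quantile(0.99)  # within about 1 percent of 990
```

Because the bucket index depends only on the value and on gamma, two sketches built independently land on the same buckets, which is why merging reduces to adding counts.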

Key idea