System Design

Sampling Traces

Keeping a representative subset of traces to control cost.

5 min read · core

Tracing every request in a busy system would produce a crushing volume of data. Sampling keeps only a fraction of traces while still giving useful insight. The question is which fraction and how to choose it.

Head-based sampling

In head-based sampling the decision is made when the request starts, before the outcome is known. A simple rule keeps, say, one in a hundred traces. It is cheap and easy because the decision travels with the trace context, so every downstream service follows it without extra coordination. The weakness is that the sampler cannot see the outcome: at one in a hundred, about ninety-nine of every hundred errors and slow requests are discarded along with everything else.
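A minimal sketch of a head-based decision, assuming string trace IDs and a rate of one in a hundred. The names head_sample and start_request are hypothetical and do not come from any particular tracing library. Hashing the trace ID rather than drawing a random number is one common way to make every service reach the same answer for the same trace.

```python
import hashlib

SAMPLE_RATE = 0.01  # assumed rate: keep roughly one trace in a hundred


def head_sample(trace_id: str, rate: float = SAMPLE_RATE) -> bool:
    """Decide at request start, using only the trace ID (no outcome yet)."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer in [0, 2**64) and keep the
    # trace when it falls below the rate-sized slice of that range.
    bucket = int.from_bytes(digest[:8], "big")
    return bucket < int(rate * 2**64)


def start_request(trace_id: str) -> dict:
    """Build a hypothetical context that carries the decision downstream."""
    return {"trace_id": trace_id, "sampled": head_sample(trace_id)}


if __name__ == "__main__":
    kept = sum(head_sample(f"trace-{i}") for i in range(100_000))
    print(f"kept {kept} of 100000 traces")
```

Because the function never looks at status or latency, a failing request has the same one in a hundred chance of being kept as a healthy one.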

Tail-based sampling

In tail-based sampling the system buffers spans and decides after the trace finishes. Now it can keep the interesting ones, such as every error or any request slower than a threshold, while dropping routine fast successes. This catches the traces you actually want, but it needs memory to hold spans until the trace completes, and all spans of a trace must reach the same place before the decision can be made.
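A minimal sketch of the buffering step, under simplifying assumptions: all spans of a trace arrive at one process, the root span ends last and marks the trace as finished, and the slow threshold is 500 ms. Span, TailSampler, and the threshold are hypothetical names and values chosen for illustration. A real system would also need a timeout for traces whose root span never arrives.

```python
from collections import defaultdict
from dataclasses import dataclass

SLOW_THRESHOLD_MS = 500.0  # assumed latency threshold for "slow"


@dataclass
class Span:
    trace_id: str
    name: str
    duration_ms: float
    is_error: bool = False
    is_root: bool = False  # assumption: the root span ends last


class TailSampler:
    def __init__(self, slow_ms: float = SLOW_THRESHOLD_MS):
        self.slow_ms = slow_ms
        self.buffer = defaultdict(list)  # trace_id -> spans held in memory
        self.kept = []  # stand-in for exporting kept traces

    def on_span_end(self, span: Span) -> None:
        # Every span is buffered; nothing is decided until the trace ends.
        self.buffer[span.trace_id].append(span)
        if span.is_root:
            self._decide(span.trace_id, span.duration_ms)

    def _decide(self, trace_id: str, total_ms: float) -> None:
        spans = self.buffer.pop(trace_id)  # free the memory either way
        if any(s.is_error for s in spans) or total_ms > self.slow_ms:
            self.kept.append(spans)
        # Routine fast successes are simply dropped here.
```

The buffer dictionary is where the extra cost shows up: its size grows with the number of traces in flight and the number of spans in each.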

Choosing

  • Head-based sampling is simple and cheap but blind to outcomes
  • Tail-based sampling is smarter but costs buffering and coordination
  • Many systems combine both, sampling broadly at the head and force-keeping errors at the tail
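The combined approach in the last bullet can be sketched as a single keep rule. This is an illustration only: keep_trace and its parameters are hypothetical, the 500 ms threshold is an assumed value, and the sketch assumes spans still reach the buffering stage even when the head decision was "drop", since otherwise there would be nothing left to force-keep.

```python
def keep_trace(head_sampled: bool, has_error: bool, duration_ms: float,
               slow_ms: float = 500.0) -> bool:
    """Final decision once the trace has finished."""
    # Tail rules override the head decision for the traces that matter.
    if has_error or duration_ms > slow_ms:
        return True
    # Otherwise fall back to the cheap decision made at the start.
    return head_sampled
```

With this rule the head sample supplies a baseline of ordinary traffic, while errors and slow requests are kept regardless of how the head decision went.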

Key idea

Head sampling decides cheaply at the start but misses outliers, while tail sampling buffers and keeps the errors and slow traces that matter.

Check yourself

1. What is the main weakness of head-based sampling?

2. Why does tail-based sampling cost more resources?