Sampling traces
Tracing every request in a busy system would produce a crushing volume of data. Sampling keeps only a fraction of traces while still giving useful insight. The question is which fraction and how to choose it.
Head-based sampling
In head-based sampling the decision is made when the request starts, before the outcome is known. A simple rule keeps, say, one in a hundred traces. It is cheap and easy because the decision travels with the trace context, so downstream services follow it instead of deciding again. The weakness is that the rule cannot see outcomes: at one in a hundred, about 99 of every 100 errors and slow requests are discarded along with the routine traffic.
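A minimal sketch of such a rule, assuming the decision is derived by hashing the trace ID. The function name `head_sample` and the `rate` parameter are hypothetical, chosen for illustration, and do not come from any particular tracing library. Hashing the ID, rather than drawing a random number, means any service that sees the same ID reaches the same answer.

```python
import hashlib


def head_sample(trace_id: str, rate: float = 0.01) -> bool:
    """Decide at request start whether to keep a trace, using only its ID.

    Hypothetical helper: the outcome of the request is not an input,
    which is exactly why errors and slow requests get no special treatment.
    """
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer in [0, 2**64) and keep the
    # trace when it falls in the lowest `rate` fraction of that range.
    bucket = int.from_bytes(digest[:8], "big")
    return bucket < int(rate * 2**64)


# Same ID, same decision, so the choice is consistent across services.
kept = sum(head_sample(f"trace-{i}") for i in range(100_000))
print(kept)  # roughly 1% of the 100,000 traces
```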
Tail-based sampling
In tail-based sampling the system buffers spans and decides after the trace finishes. Now it can keep the interesting ones, such as every error or any request slower than a threshold, while dropping routine fast successes. This catches the traces you actually want, but it needs memory to hold spans until the trace completes, and every span of a trace has to reach the place where the decision is made.
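The buffering can be sketched as below, under simplifying assumptions: a single in-memory sampler receives every span, and the arrival of the root span marks the trace as finished. The `Span` and `TailSampler` names, the `is_root` flag, and the 500 ms default threshold are all hypothetical, and a real collector would also need a timeout for traces whose root span never arrives.

```python
from dataclasses import dataclass


@dataclass
class Span:
    trace_id: str
    duration_ms: float
    is_error: bool = False
    is_root: bool = False  # assumption: the root span arrives last


class TailSampler:
    def __init__(self, slow_threshold_ms=500.0):
        self.slow_threshold_ms = slow_threshold_ms
        self._buffer = {}  # trace_id -> list of spans seen so far

    def add(self, span):
        """Buffer a span; return the whole trace if it finished and is kept."""
        spans = self._buffer.setdefault(span.trace_id, [])
        spans.append(span)
        if not span.is_root:
            return None  # trace still in flight, keep holding its spans
        # The trace is complete: free the memory, then decide with the outcome known.
        del self._buffer[span.trace_id]
        has_error = any(s.is_error for s in spans)
        is_slow = span.duration_ms > self.slow_threshold_ms
        return spans if has_error or is_slow else None


sampler = TailSampler(slow_threshold_ms=500.0)
sampler.add(Span("a", 20.0, is_error=True))               # child span, buffered
error_trace = sampler.add(Span("a", 100.0, is_root=True))  # kept: contains an error
fast_trace = sampler.add(Span("b", 50.0, is_root=True))    # dropped: fast success
slow_trace = sampler.add(Span("c", 900.0, is_root=True))   # kept: over threshold
```

The `_buffer` dictionary is the memory cost the text describes: it grows with the number of traces in flight, not with the number of traces kept.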
Choosing
- Head-based is simple and cheap but blind to outcomes
- Tail-based is smarter but costs buffering and coordination
- Many systems combine both, sampling broadly at the head and forcing keeps for errors at the tail
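One way to picture the combined approach is a final keep decision that consults both signals. This is a sketch, not a description of any specific system: `keep_trace` and its parameters are hypothetical, and it assumes spans are buffered long enough for the error flag to be known.

```python
import hashlib


def keep_trace(trace_id: str, had_error: bool, rate: float = 0.01) -> bool:
    """Hypothetical combined rule: errors are always kept, the rest are sampled."""
    if had_error:
        return True  # tail-side override: the outcome forces a keep
    # Otherwise fall back to the cheap head-style decision based on the ID alone.
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") < int(rate * 2**64)
```

With this rule the baseline sample still shows what normal traffic looks like, while the override means no error trace is lost to the sampling rate.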
Key idea
Head sampling decides cheaply at the start but misses outliers, while tail sampling buffers and keeps the errors and slow traces that matter.