Distributed tracing
In a microservice system a single user request may touch a dozen services. When that request is slow, the logs of any one service cannot tell you where the time went. Distributed tracing stitches the whole journey together so you can see both the path the request took and how long each step lasted.
Traces and spans
A trace represents one request end to end and carries a unique trace id. Within a trace, each unit of work is a span. A span records its own span id, a name, a start time, a duration, and the id of its parent span. The root span has no parent, so the spans nest into a tree that mirrors the call graph.
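As a rough illustration of how spans nest into a tree, here is a minimal Python sketch. The Span class, its field names, the helper functions, and the sample service names and timings are all invented for this example and do not come from any particular tracing library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    # Hypothetical span record; field names are illustrative only.
    trace_id: str
    span_id: str
    name: str
    start_ms: float
    duration_ms: float
    parent_id: Optional[str] = None  # None marks the root span


def build_tree(spans):
    """Group spans by parent id so the trace can be walked as a tree."""
    children = {}
    for s in spans:
        children.setdefault(s.parent_id, []).append(s)
    return children


def render(children, parent_id=None, depth=0):
    """Return indented lines, one per span, ordered by start time."""
    lines = []
    for s in sorted(children.get(parent_id, []), key=lambda s: s.start_ms):
        lines.append("  " * depth + f"{s.name} ({s.duration_ms:g} ms)")
        lines.extend(render(children, s.span_id, depth + 1))
    return lines


# Made-up spans for one trace.
spans = [
    Span("t1", "a", "GET /checkout", 0, 120),
    Span("t1", "b", "auth.verify", 5, 20, parent_id="a"),
    Span("t1", "c", "cart.load", 30, 80, parent_id="a"),
    Span("t1", "d", "db.query", 35, 60, parent_id="c"),
]
tree_lines = render(build_tree(spans))
print("\n".join(tree_lines))
```

Each level of indentation in the printed output corresponds to one parent-child link, which is why the tree has the same shape as the call graph.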
Context propagation
Context propagation is what joins spans across service boundaries. When service A calls service B, it passes the trace id and the current span id in the request headers. Service B reads those headers and creates a child span under the same trace, with A's span as its parent. Without propagation each service would produce isolated spans that could never be joined into one trace.
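The hand-off between the two services can be sketched in a few lines of Python. This example assumes the W3C Trace Context header named traceparent, whose value has the form version-traceid-parentid-flags. The inject and extract helpers are hypothetical names written for this sketch, and a plain dict stands in for real HTTP headers (which are case-insensitive).

```python
import secrets


def new_trace_id() -> str:
    return secrets.token_hex(16)  # 32 hex characters


def new_span_id() -> str:
    return secrets.token_hex(8)  # 16 hex characters


def inject(headers: dict, trace_id: str, span_id: str) -> None:
    """Caller side: write the current context into outgoing headers."""
    # "00" is the format version, "01" is the sampled flag.
    headers["traceparent"] = f"00-{trace_id}-{span_id}-01"


def extract(headers: dict):
    """Callee side: return (trace_id, parent_span_id), or None if absent or malformed."""
    value = headers.get("traceparent")
    if value is None:
        return None
    parts = value.split("-")
    if len(parts) != 4 or len(parts[1]) != 32 or len(parts[2]) != 16:
        return None
    return parts[1], parts[2]


# Service A starts a trace and calls service B.
trace_id = new_trace_id()
span_a = new_span_id()
outgoing = {}
inject(outgoing, trace_id, span_a)

# Service B reads the headers and creates a child span in the same trace.
ctx = extract(outgoing)
if ctx is None:
    # No usable context: B would have to start a new, unconnected trace.
    b_trace_id, b_parent = new_trace_id(), None
else:
    b_trace_id, b_parent = ctx
span_b = new_span_id()

print(b_trace_id == trace_id, b_parent == span_a)
```

The fallback branch shows the failure mode described above: if the header is missing, service B still records a span, but it belongs to a fresh trace that cannot be linked back to service A.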
What it reveals
- The critical path and which span dominated the latency
- Where calls happened in parallel versus serially
- Errors attached to a specific span deep in the chain
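These three questions can be answered from span timestamps alone. The Python sketch below uses made-up spans and hypothetical helper names. Its path function is a simplified heuristic that follows the last-finishing child at each level, not a full critical-path analysis, and it assumes all timestamps come from comparable clocks.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    # Hypothetical span record; field names are illustrative only.
    span_id: str
    name: str
    start_ms: float
    duration_ms: float
    parent_id: Optional[str] = None
    error: Optional[str] = None

    @property
    def end_ms(self) -> float:
        return self.start_ms + self.duration_ms


def children_of(spans, parent_id):
    return [s for s in spans if s.parent_id == parent_id]


def last_finishing_path(spans):
    """Heuristic: from the root, keep descending into the child that ends last."""
    path = []
    level = children_of(spans, None)
    while level:
        current = max(level, key=lambda s: s.end_ms)
        path.append(current.name)
        level = children_of(spans, current.span_id)
    return path


def overlapped(a, b):
    """True if the two spans were in flight at the same time."""
    return a.start_ms < b.end_ms and b.start_ms < a.end_ms


# Made-up spans for one trace.
spans = [
    Span("a", "GET /checkout", 0, 200),
    Span("b", "inventory.check", 10, 50, parent_id="a"),
    Span("c", "pricing.quote", 10, 70, parent_id="a"),
    Span("d", "payment.charge", 85, 110, parent_id="a"),
    Span("e", "bank.authorize", 90, 100, parent_id="d", error="timeout"),
]
by_name = {s.name: s for s in spans}

path = last_finishing_path(spans)
parallel = overlapped(by_name["inventory.check"], by_name["pricing.quote"])
serial = not overlapped(by_name["pricing.quote"], by_name["payment.charge"])
failed = [s.name for s in spans if s.error]

print(path, parallel, serial, failed)
```

In this sample data the inventory and pricing calls overlap, the payment call starts only after both have finished, and the one error sits on the deepest span of the slowest branch.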
Key idea
A trace links spans across services through propagated context, revealing exactly where a request spent its time.