System Design

Distributed Tracing

Following one request across dozens of services to find where time went.

The lost request

In a microservice system, one user request might touch twenty services. When it is slow, each service's logs show only that service's share of the work, so no single log can tell you where the time went. Distributed tracing stitches the whole journey back together.

Traces and spans

  • A trace represents one request as it travels through the system.
  • Each unit of work within it is a span, with a start time and duration.
  • Spans nest, so you can see that the database call inside the order service took most of the time.
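As a rough illustration of the list above, here is a minimal sketch of how spans might be modeled and nested. The `Span` class, its field names, and the sample timings are all hypothetical, chosen for this example; real tracing libraries define their own types.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    """One unit of work inside a trace (hypothetical model, not a real library type)."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]  # None marks the root span of the trace
    name: str
    start_ms: float
    duration_ms: float
    children: List["Span"] = field(default_factory=list)


def build_tree(spans: List[Span]) -> Span:
    """Attach each span to its parent and return the root span."""
    by_id = {s.span_id: s for s in spans}
    for s in spans:
        s.children = []  # reset so the function can be called more than once
    root = None
    for s in spans:
        if s.parent_id is None:
            root = s
        else:
            by_id[s.parent_id].children.append(s)
    if root is None:
        raise ValueError("trace has no root span")
    return root


def render(span: Span, depth: int = 0) -> List[str]:
    """Return one indented line per span, children ordered by start time."""
    lines = [f"{'  ' * depth}{span.name} ({span.duration_ms:.0f} ms)"]
    for child in sorted(span.children, key=lambda c: c.start_ms):
        lines.extend(render(child, depth + 1))
    return lines


# Made-up sample data: one request, three nested spans.
spans = [
    Span("t1", "a", None, "GET /checkout", 0, 480),
    Span("t1", "b", "a", "order-service", 10, 450),
    Span("t1", "c", "b", "db.query", 30, 400),
]
tree = build_tree(spans)
print("\n".join(render(tree)))
```

The indented output mirrors the nesting: the database call sits inside the order service span, which sits inside the request span.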

Context propagation

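The mechanism is a trace id passed along in request headers from service to service, usually together with the id of the calling span. Every span records the same trace id, so a collector can reassemble them into one timeline even though they came from many machines; the parent span id is what lets the collector rebuild the nesting.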
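Propagation can be sketched in a few lines. The example below assumes the `traceparent` header layout from the W3C Trace Context format (`version-traceid-parentid-flags`); the `handle`, `extract`, and `inject` helpers and the service names are hypothetical, and plain dictionaries stand in for real HTTP headers.

```python
import secrets
from typing import Dict, List, Optional, Tuple

HEADER = "traceparent"  # header name used by the W3C Trace Context format


def extract(headers: Dict[str, str]) -> Tuple[str, Optional[str]]:
    """Return (trace_id, parent_span_id); start a new trace if the header is absent."""
    value = headers.get(HEADER)
    if value is None:
        return secrets.token_hex(16), None  # 32 hex chars, no parent
    _version, trace_id, parent_id, _flags = value.split("-")
    return trace_id, parent_id


def inject(trace_id: str, span_id: str) -> Dict[str, str]:
    """Build the outgoing headers so the next service joins the same trace."""
    return {HEADER: f"00-{trace_id}-{span_id}-01"}


def handle(service: str, incoming: Dict[str, str], recorded: List[dict]) -> Dict[str, str]:
    """Hypothetical request handler: record a span, then pass the context on."""
    trace_id, parent_id = extract(incoming)
    span_id = secrets.token_hex(8)  # 16 hex chars
    recorded.append(
        {"service": service, "trace_id": trace_id,
         "span_id": span_id, "parent_id": parent_id}
    )
    return inject(trace_id, span_id)


# Simulate gateway -> orders -> payments; each hop forwards the headers it built.
recorded: List[dict] = []
h1 = handle("gateway", {}, recorded)
h2 = handle("orders", h1, recorded)
handle("payments", h2, recorded)

for span in recorded:
    print(span["service"], span["trace_id"], span["parent_id"])
```

All three recorded spans share one trace id, and each names the previous hop's span as its parent, which is the information a collector needs to reassemble the timeline.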

A trace view shows which span dominated the latency, turning a vague "the request is slow" into a specific bottleneck.
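One way to find the dominant span programmatically is to compute each span's self time: its duration minus the time spent in its direct children. The sketch below uses made-up spans and assumes children run one after another without overlapping; concurrent children would need interval arithmetic instead.

```python
from typing import Dict, List

# Made-up spans for one trace; field names are hypothetical.
spans: List[dict] = [
    {"id": "a", "parent": None, "name": "GET /checkout", "duration_ms": 480},
    {"id": "b", "parent": "a", "name": "order-service", "duration_ms": 450},
    {"id": "c", "parent": "b", "name": "db.query", "duration_ms": 400},
    {"id": "d", "parent": "a", "name": "auth-service", "duration_ms": 20},
]


def self_times(spans: List[dict]) -> Dict[str, int]:
    """Map span name to time spent in the span itself, excluding direct children."""
    child_total: Dict[str, int] = {}
    for s in spans:
        if s["parent"] is not None:
            child_total[s["parent"]] = child_total.get(s["parent"], 0) + s["duration_ms"]
    return {s["name"]: s["duration_ms"] - child_total.get(s["id"], 0) for s in spans}


times = self_times(spans)
slowest = max(times, key=times.get)
print(slowest, times[slowest])  # prints: db.query 400
```

Ranking by self time rather than raw duration keeps parent spans from looking slow merely because they wait on a slow child.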

Key idea

Distributed tracing propagates a trace id across services so one request becomes a single timeline of spans you can analyze.

Check yourself

1. What is a span in distributed tracing?

2. How are spans from different services linked into one trace?