System Design

The Trace Aggregation Backend

Where spans from thousands of machines are gathered, reassembled, and made queryable.

The Reassembly Problem

Spans are emitted independently by many services, so they reach the backend out of order and at different times. The aggregation backend collects them and rebuilds each trace by grouping the spans that share a trace id.

The Pipeline

  • Ingest: receive spans over the network, often through a queue to absorb bursts.
  • Assemble: group spans by trace id and link them by parent id into a tree.
  • Index: build lookups by service, operation, duration, and tags so queries are fast.
  • Store: persist the assembled trace for later retrieval.
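The assemble and index steps can be sketched in a few lines of Python. This is a minimal illustration, not any particular backend's implementation: the `Span` fields and the `assemble` and `build_service_index` helpers are hypothetical names chosen for the example, and it assumes all spans are already in memory. A span whose parent has not arrived is treated here as a root, which is one possible policy among several.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Span:
    # Hypothetical span shape; real tracing systems carry more fields.
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    service: str
    operation: str
    duration_ms: float
    children: List["Span"] = field(default_factory=list)


def assemble(spans: List[Span]) -> Dict[str, List[Span]]:
    """Group spans by trace id, then link each span under its parent."""
    by_trace: Dict[str, Dict[str, Span]] = defaultdict(dict)
    for s in spans:
        by_trace[s.trace_id][s.span_id] = s

    roots: Dict[str, List[Span]] = {}
    for trace_id, by_id in by_trace.items():
        trace_roots = []
        for s in by_id.values():
            parent = by_id.get(s.parent_id) if s.parent_id else None
            if parent is not None:
                parent.children.append(s)
            else:
                # No parent id, or the parent span is missing: keep as a root.
                trace_roots.append(s)
        roots[trace_id] = trace_roots
    return roots


def build_service_index(spans: List[Span]) -> Dict[str, Set[str]]:
    """Map each service name to the trace ids it took part in."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for s in spans:
        index[s.service].add(s.trace_id)
    return index
```

Because linking happens only after every span in the batch has been grouped, arrival order does not matter: a child that arrives before its parent is still attached correctly.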

Late and Missing Spans

Because spans trickle in, the backend cannot know for certain that a trace is complete. It uses a time window instead: once no new spans have arrived for a trace for a set period, the trace is treated as done. A span that arrives after the window has closed is late, and depending on the backend's policy it may be dropped or appended to the stored trace.
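The time window can be sketched as a small buffer that tracks when each trace last received a span. This is an illustrative sketch under stated assumptions: `TraceWindow`, `quiet_seconds`, `add`, and `flush` are hypothetical names, timestamps are passed in explicitly so the behaviour is easy to follow, and late spans are dropped here even though appending them is an equally valid policy.

```python
from typing import Any, Dict, List, Set


class TraceWindow:
    """Buffers spans per trace and closes a trace after a quiet period."""

    def __init__(self, quiet_seconds: float) -> None:
        self.quiet_seconds = quiet_seconds
        self.open: Dict[str, List[Any]] = {}    # trace_id -> buffered spans
        self.last_seen: Dict[str, float] = {}   # trace_id -> last arrival time
        # Closed trace ids; a real system would need to expire these.
        self.closed: Set[str] = set()

    def add(self, trace_id: str, span: Any, now: float) -> bool:
        """Buffer a span. Returns False if the span is late and was dropped."""
        if trace_id in self.closed:
            return False
        self.open.setdefault(trace_id, []).append(span)
        self.last_seen[trace_id] = now
        return True

    def flush(self, now: float) -> Dict[str, List[Any]]:
        """Return and close every trace that has been quiet long enough."""
        done: Dict[str, List[Any]] = {}
        for trace_id, seen in list(self.last_seen.items()):
            if now - seen >= self.quiet_seconds:
                done[trace_id] = self.open.pop(trace_id)
                del self.last_seen[trace_id]
                self.closed.add(trace_id)
        return done
```

Each new span resets the quiet timer for its trace, so a long-running trace stays open as long as spans keep arriving. A shorter window makes traces queryable sooner but produces more late spans.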

Key idea

The aggregation backend ingests out-of-order spans, groups them by trace id into trees, indexes them for querying, and uses a time window to decide when a trace is complete.

Check yourself

1. How does the backend group spans into a single trace?

2. Why does the backend use a time window when assembling a trace?