Traces Are Big and Many
A busy system produces enormous trace volume. Keeping every span forever is unaffordable, so storage and retention policy decides what survives and for how long.
Levers You Control
- Sampling: store only a fraction of traces; this is the first line of defense against volume.
- Retention windows: keep recent traces in fast storage, expire old ones.
- Tiering: move older traces to cheaper, slower storage instead of deleting.
- Index vs raw split: keep lightweight searchable summaries longer than the full span detail.
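The sampling lever can be sketched as a deterministic keep-or-drop decision per trace. This is a minimal illustration, not a specific tracing library's API: the function name `keep_trace`, the `sample_rate` parameter, and the choice of hashing the trace ID are all assumptions made for the example.

```python
import hashlib


def keep_trace(trace_id: str, sample_rate: float) -> bool:
    """Decide whether to store a trace, given a target fraction to keep.

    Hypothetical helper: hashing the trace ID (rather than drawing a random
    number) means every service that sees the same ID makes the same decision,
    so a trace is stored whole or not at all.
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0.0 and 1.0")
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash onto [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < sample_rate


# Usage: keep roughly 10% of traces.
kept = sum(keep_trace(f"trace-{i}", 0.10) for i in range(10_000))
print(f"kept {kept} of 10000 traces")
```

The stored fraction only approximates `sample_rate`, and the approximation tightens as the number of traces grows.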
A Tiered Lifecycle
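One way to picture a tiered lifecycle is a function that maps a trace's age to the storage tier it should live in. The sketch below is illustrative only: the tier names and the 7-day and 365-day windows are assumed values, not recommendations from any particular backend.

```python
from datetime import timedelta

# Assumed windows, chosen only for illustration.
FAST_WINDOW = timedelta(days=7)     # recent traces stay in fast storage
COLD_WINDOW = timedelta(days=365)   # older traces move to cheap, slow storage


def tier_for_age(age: timedelta) -> str:
    """Return the hypothetical tier name for a trace of the given age."""
    if age < FAST_WINDOW:
        return "fast"
    if age < COLD_WINDOW:
        return "cold"
    return "expired"  # past every window: eligible for deletion


# Usage: a background job could call this to decide where each trace belongs.
for days in (1, 30, 400):
    print(days, "days ->", tier_for_age(timedelta(days=days)))
```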
Balancing Cost and Value
Recent traces are queried constantly during incidents, so they justify fast storage. Old traces are rarely opened but may be needed for compliance or trend analysis, and cheap cold tiers exist to serve that need. A common compromise keeps full detail for days, searchable summaries for weeks, and aggregate stats for far longer.
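The compromise above can be expressed as a rule for which representations of a trace still exist at a given age. This is a sketch under assumed numbers: "days" is taken as 7 days and "weeks" as 8 weeks, and the names `full_spans`, `summary`, and `aggregate_stats` are hypothetical labels.

```python
from datetime import timedelta

# Assumed retention limits, for illustration only.
FULL_DETAIL_RETENTION = timedelta(days=7)
SUMMARY_RETENTION = timedelta(weeks=8)


def available_detail(age: timedelta) -> set[str]:
    """Return which representations of a trace are still kept at this age."""
    kinds = {"aggregate_stats"}  # kept far longer than the other two
    if age <= SUMMARY_RETENTION:
        kinds.add("summary")
    if age <= FULL_DETAIL_RETENTION:
        kinds.add("full_spans")
    return kinds


# Usage: a three-week-old trace can still be found by search,
# but its full span detail is gone.
print(sorted(available_detail(timedelta(weeks=3))))
```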
Key idea
Trace retention balances cost against value using sampling, time-based windows, and storage tiers, keeping recent traces fast and aging old ones into cheap archives.