System Design

Design a Log Aggregation Pipeline

Collect, ship, and index logs from many services for search and alerting.

Requirements

  • Collect logs from thousands of hosts and services.
  • Make logs searchable with low query latency.
  • Buffer bursts so spikes do not lose data.

High level design

Agents ship logs through a buffer into a search index and an archive.

  • Collection: a lightweight agent on each host tails files and forwards lines.
  • Buffer: a partitioned log such as Kafka decouples producers from consumers and absorbs bursts.
  • Processing: consumers parse, enrich, and route logs to an index for search and to cheap object storage for long-term retention.
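
The three stages above can be sketched in memory. This is a minimal illustration, not a real Kafka client: `PartitionedBuffer`, `partition_for`, `parse`, `consume`, the partition count, and the `LEVEL message` line format are all assumptions made for the example.

```python
import json
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count


def partition_for(host: str) -> int:
    """Map a host to a partition with a stable hash so its lines stay ordered."""
    return zlib.crc32(host.encode("utf-8")) % NUM_PARTITIONS


class PartitionedBuffer:
    """In-memory stand-in for a partitioned log such as Kafka."""

    def __init__(self) -> None:
        self.partitions = defaultdict(list)

    def append(self, host: str, line: str) -> None:
        """Collection side: an agent forwards one raw line."""
        self.partitions[partition_for(host)].append((host, line))

    def read(self, partition: int, offset: int) -> list:
        """Return records at or after `offset`; consumers track their own offset."""
        return self.partitions[partition][offset:]


def parse(host: str, line: str) -> dict:
    """Parse an assumed 'LEVEL message' line and enrich it with the source host."""
    level, _, message = line.partition(" ")
    return {"host": host, "level": level, "message": message}


def consume(buffer: PartitionedBuffer, search_index: list, archive: list) -> None:
    """Processing side: drain every partition and route each record to both sinks."""
    for partition in range(NUM_PARTITIONS):
        for host, line in buffer.read(partition, 0):
            record = parse(host, line)
            search_index.append(record)         # hot path: searchable
            archive.append(json.dumps(record))  # cold path: cheap long-term storage


buffer = PartitionedBuffer()
buffer.append("web-1", "ERROR upstream timed out")
buffer.append("web-2", "INFO request served")

search_index, archive = [], []
consume(buffer, search_index, archive)
```

Agents only ever call `append`, and consumers only ever call `read`, so neither side needs to know how fast the other is running. That separation is what lets the buffer absorb bursts.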

Bottlenecks

  • Burst volume: an incident can trigger a log storm, so the buffer absorbs the spike while consumers drain it at their own pace, and it applies backpressure to agents if it fills.
  • Index cost: full-text indexing is expensive, so index recent hot data and archive older logs cheaply.
  • Cardinality: high-cardinality fields (for example request IDs) balloon the index, so sample or drop noisy fields.
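
Two of these mitigations are small enough to sketch directly. The example below assumes a bounded in-process queue standing in for the buffer, a hypothetical on-host spool for lines that do not fit, and an assumed set of noisy field names. A production agent would also retry the spooled lines later.

```python
import queue

buffer = queue.Queue(maxsize=3)  # tiny bound so the full-buffer case is visible
local_spool = []                 # hypothetical on-host fallback spool


def ship(line: str) -> bool:
    """Try to enqueue a line; if the buffer is full, spool it locally instead."""
    try:
        buffer.put_nowait(line)
        return True
    except queue.Full:
        local_spool.append(line)  # backpressure: hold the line rather than lose it
        return False


NOISY_FIELDS = {"request_id", "session_id"}  # assumed high-cardinality fields


def strip_noisy_fields(record: dict) -> dict:
    """Return a copy of the record without fields too unique to index cheaply."""
    return {key: value for key, value in record.items() if key not in NOISY_FIELDS}


# A burst of five lines against a buffer that holds three.
results = [ship(f"line {i}") for i in range(5)]

indexed = strip_noisy_fields(
    {"level": "ERROR", "service": "checkout", "request_id": "a1b2"}
)
```

The full record, including `request_id`, can still be written to the archive. Only the indexed copy is trimmed.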

Tradeoffs

  • Indexing everything enables rich search but costs heavily in storage and compute.
  • Tiering data from hot to cold cuts cost but makes queries over old logs slower.
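
The tiering tradeoff shows up when a query is routed by time range. The sketch below assumes a seven-day hot window; `HOT_RETENTION`, `tier_for`, and `tiers_for_query` are hypothetical names chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

HOT_RETENTION = timedelta(days=7)  # assumed hot window; tune to budget


def tier_for(log_time: datetime, now: datetime) -> str:
    """Logs newer than the hot window live in the index, the rest in the archive."""
    return "hot" if now - log_time <= HOT_RETENTION else "cold"


def tiers_for_query(start: datetime, end: datetime, now: datetime) -> list:
    """Return the tiers a query over [start, end] has to read."""
    cutoff = now - HOT_RETENTION
    tiers = []
    if end >= cutoff:
        tiers.append("hot")   # fast: served by the search index
    if start < cutoff:
        tiers.append("cold")  # slow: scans archived objects
    return tiers


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
recent = tiers_for_query(now - timedelta(days=2), now, now)
old = tiers_for_query(now - timedelta(days=30), now - timedelta(days=20), now)
spanning = tiers_for_query(now - timedelta(days=10), now, now)
```

A query that stays inside the hot window never touches the archive, which is why recent searches stay fast while older ones pay the cold-read cost.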

Key idea

A log pipeline ships logs through a partitioned buffer into a hot search index and a cold archive, tiering data so recent logs are fast to query and old logs are cheap to keep.

Check yourself

1. Why place a partitioned log buffer between agents and processors?

2. Why tier logs into a hot index and a cold archive?