System Design

Design a Log Aggregation Pipeline

Collect, ship, and index logs from many services for search and alerting.

Requirements

Collect logs from thousands of hosts and services.
Make logs searchable with low query latency.
Buffer bursts so spikes do not lose data.

High level design

Agents ship logs through a buffer into a search index and archive.

Collection: a lightweight agent on each host tails files and forwards lines.
Buffer: a partitioned log such as Kafka decouples producers from consumers and absorbs bursts.
Processing: consumers parse, enrich, and route logs to an index for search and to cheap object storage for long-term retention.
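
The three stages above can be sketched in a few lines of Python. This is only an in-memory illustration, not a real Kafka client: the class PartitionedBuffer, the helper names, the partition count, and the "LEVEL message" line format are all assumptions made for the example.

```python
import json
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # assumed partition count for this sketch


def partition_for(host, num_partitions=NUM_PARTITIONS):
    """Pick a partition from a stable hash of the host name."""
    return zlib.crc32(host.encode("utf-8")) % num_partitions


class PartitionedBuffer:
    """Hypothetical in-memory stand-in for a partitioned log such as Kafka."""

    def __init__(self, num_partitions=NUM_PARTITIONS):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, host, line):
        """Collection side: an agent forwards one raw line."""
        p = partition_for(host, len(self.partitions))
        self.partitions[p].append({"host": host, "line": line})
        return p

    def read(self, partition, offset=0):
        """Consumer side: read records from an offset onward."""
        return self.partitions[partition][offset:]


def parse_and_enrich(record):
    """Split an assumed 'LEVEL message' line and attach the source host."""
    level, _, message = record["line"].partition(" ")
    return {"host": record["host"], "level": level, "message": message}


def consume(buffer, search_index, archive):
    """Processing: route every event to the archive and to the search index."""
    for p in range(len(buffer.partitions)):
        for record in buffer.read(p):
            event = parse_and_enrich(record)
            archive.append(json.dumps(event))           # long-term retention
            search_index[event["level"]].append(event)  # searchable by level


def run_demo():
    buffer = PartitionedBuffer()
    buffer.append("web-1", "ERROR disk full")
    buffer.append("web-1", "INFO service started")
    search_index, archive = defaultdict(list), []
    consume(buffer, search_index, archive)
    return search_index, archive
```

Hashing on the host keeps each host's lines in one partition, so their relative order is preserved for whichever consumer reads that partition.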

Bottlenecks

Burst volume: an incident can trigger a log storm, so the buffer absorbs the spike and, when it nears capacity, applies backpressure so producers slow down instead of losing data.
Index cost: full text indexing is expensive, so index recent hot data and archive older logs cheaply.
Cardinality: high-cardinality fields balloon the index, so sample or drop noisy fields before indexing.
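
Two of these mitigations are easy to show in miniature. The sketch below assumes a bounded in-process queue as the buffer and a hypothetical set of noisy field names (request_id, session_id); a real pipeline would tune both to its own data.

```python
import queue

# Hypothetical high-cardinality fields to strip before indexing.
NOISY_FIELDS = frozenset({"request_id", "session_id"})


def offer(buf, line):
    """Try to enqueue a line without blocking.

    Returns False when the bounded buffer is full, which is the
    backpressure signal: the caller should slow down or retry later.
    """
    try:
        buf.put_nowait(line)
        return True
    except queue.Full:
        return False


def drop_noisy_fields(event, noisy=NOISY_FIELDS):
    """Return a copy of the event without the noisy fields."""
    return {k: v for k, v in event.items() if k not in noisy}
```

With queue.Queue(maxsize=2), the first two calls to offer succeed and the third returns False until a consumer drains the queue.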

Tradeoffs

Indexing everything enables rich search but costs heavily in storage and compute.
Tiering from hot to cold cuts cost but makes queries over old logs slower.
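
The tiering tradeoff shows up directly in query routing. As a rough sketch, assuming a 7-day hot window (an illustrative value, not one from this lesson) and hypothetical function names, a query only pays the slow cold-archive cost when its time range reaches past the window:

```python
from datetime import datetime, timedelta, timezone

HOT_RETENTION = timedelta(days=7)  # assumed hot window for this sketch


def tier_for(event_time, now, hot_retention=HOT_RETENTION):
    """Classify a single log event by age."""
    return "hot" if now - event_time <= hot_retention else "cold"


def tiers_for_query(start, end, now, hot_retention=HOT_RETENTION):
    """List the tiers a query over [start, end] has to touch."""
    cutoff = now - hot_retention
    tiers = []
    if end >= cutoff:
        tiers.append("hot")   # fast search index
    if start < cutoff:
        tiers.append("cold")  # slower archive scan
    return tiers
```

A query over the last day touches only the hot index, while a query spanning the last 30 days touches both tiers and is bounded by the slower one.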

Key idea

A log pipeline ships logs through a partitioned buffer into a hot search index and a cold archive, tiering data so recent logs are fast to query and old logs are cheap to keep.

Check yourself

1. Why place a partitioned log buffer between agents and processors?

2. Why tier logs into a hot index and a cold archive?