← Lessons

quiz vs the machine

Gold1400

System Design

The Lambda Architecture Deep Dive

Combining a batch layer for accuracy and a speed layer for freshness, merged at query time.

5 min read · core · beat Gold to climb

The motivation

Lambda architecture answers a single question with two pipelines. A batch layer recomputes accurate results over all historical data, while a speed layer processes recent events for low latency. A serving layer merges both at query time.

The three layers

  • Batch layer stores the immutable master dataset and periodically recomputes batch views. These are slow but correct.
  • Speed layer processes only recent data the batch layer has not yet covered, producing real time views that fill the gap.
  • Serving layer indexes both views and combines them so a query reflects complete plus fresh data.

The cost

The big drawback is duplicated logic. The same aggregation must be written and maintained twice, once in batch and once in streaming, with different frameworks and subtle behavior differences.

Lambda was popular before stream engines could reprocess history reliably. It guarantees eventual correctness because batch overwrites approximate speed results.

Key idea

Lambda architecture pairs an accurate batch layer with a fast speed layer and merges them, accepting duplicated logic for both freshness and correctness.

Check yourself

Answer to earn rating on the learn ladder.

1. What does the speed layer contribute?

2. What is the main drawback of lambda architecture?

3. How does lambda achieve eventual correctness?