← Lessons

quiz vs the machine

Gold1420

System Design

The Kappa Architecture Deep Dive

Using a single streaming pipeline for everything, reprocessing history by replaying the log.

5 min read · core · beat Gold to climb

The simplification

Kappa architecture removes the batch layer entirely. There is one streaming pipeline that handles both real time and historical processing. The key enabler is a durable, replayable log like a partitioned commit log that retains events long term.

How reprocessing works

When logic changes, you do not run a separate batch job. Instead you replay the log from the beginning through a new version of the stream job, building a fresh output. Once it catches up, you switch traffic to it.

Benefits over lambda

  • One codebase for both paths removes duplicated logic and drift.
  • Reprocessing is just replaying the retained log, so correction is uniform.
  • The same operators serve live and historical needs.

The constraints

  • The log must retain enough history, which costs storage.
  • Replaying very large histories can be slow and resource heavy.

Kappa fits well when the stream engine supports strong state and exactly once semantics, since the streaming job must match batch quality.

Key idea

Kappa architecture uses one streaming pipeline plus a replayable log, gaining simplicity by reprocessing history through the same code instead of a separate batch layer.

Check yourself

Answer to earn rating on the learn ladder.

1. What does kappa architecture remove compared to lambda?

2. How does kappa reprocess historical data?

3. What is a key cost of kappa?