The simplification
Kappa architecture removes the batch layer entirely. There is one streaming pipeline that handles both real time and historical processing. The key enabler is a durable, replayable log like a partitioned commit log that retains events long term.
How reprocessing works
When logic changes, you do not run a separate batch job. Instead you replay the log from the beginning through a new version of the stream job, building a fresh output. Once it catches up, you switch traffic to it.
Benefits over lambda
- One codebase for both paths removes duplicated logic and drift.
- Reprocessing is just replaying the retained log, so correction is uniform.
- The same operators serve live and historical needs.
The constraints
- The log must retain enough history, which costs storage.
- Replaying very large histories can be slow and resource heavy.
Kappa fits well when the stream engine supports strong state and exactly once semantics, since the streaming job must match batch quality.
Key idea
Kappa architecture uses one streaming pipeline plus a replayable log, gaining simplicity by reprocessing history through the same code instead of a separate batch layer.