State as a first class citizen
Many stream computations are stateful. Counting events per user, joining two streams, or detecting patterns all require remembering past data. Flink treats this state as a core primitive that operators own and manage directly.
Keyed state
Most state in Flink is keyed state, partitioned by a key like user id. Flink routes all events for a key to the same operator instance, which keeps that key's state locally. This lets state scale by partitioning across the cluster.
Local state backends
State lives in a state backend. It can sit in heap memory for speed or in an embedded key value store on local disk for large state that exceeds memory. Keeping state local avoids a network call per event.
Consistency under failure
Because the job runs forever, state must survive crashes. Flink periodically snapshots all operator state into durable storage so it can restore a consistent view and resume without losing or double counting.
Key idea
Flink makes state a first class, locally held, key partitioned primitive and snapshots it durably so long running stateful streams scale and recover correctly.