Aggregating an infinite stream
A stream never ends, so you cannot wait for all the data to arrive before aggregating it. Windowing groups records into bounded time slices so you can compute sums, counts, or averages over each slice.
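As a rough illustration, the sketch below groups records into fixed, non-overlapping time slices and computes a count, sum, and average for each. The function name `aggregate_by_slice`, the `(timestamp, value)` record shape, and the 60-second slice are assumptions made for this example. A finite list stands in for the stream.

```python
from collections import defaultdict


def aggregate_by_slice(records, slice_seconds):
    """Group (timestamp, value) records into fixed time slices.

    Returns {slice_start: (count, total, average)}.
    Hypothetical helper for illustration only.
    """
    buckets = defaultdict(list)
    for timestamp, value in records:
        # Integer division snaps each timestamp to the start of its slice.
        slice_start = (timestamp // slice_seconds) * slice_seconds
        buckets[slice_start].append(value)
    return {
        start: (len(vals), sum(vals), sum(vals) / len(vals))
        for start, vals in sorted(buckets.items())
    }


# Timestamps are seconds since an arbitrary origin (assumed for the example).
records = [(0, 10), (30, 20), (65, 5), (119, 15), (120, 40)]
result = aggregate_by_slice(records, 60)
```

Each slice is aggregated on its own, so a result for one slice never depends on records that arrive in a later one.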
Common window types
- Tumbling windows are fixed size and do not overlap. A new five-minute window starts when the last one ends, so each record belongs to exactly one window.
- Sliding windows are fixed size but advance by a smaller step, so windows overlap and a record can fall into several.
- Session windows group bursts of activity separated by a gap of inactivity, so their length varies per key.
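The three window types above can be sketched as small assignment functions. This is a minimal illustration, not any particular engine's API. The function names, integer timestamps, half-open `[start, end)` windows aligned to timestamp 0, and the rule that an event exactly `gap` after the previous one still joins the same session are all assumptions made for the example.

```python
def tumbling_window(ts, size):
    """Return the single [start, end) window that contains ts."""
    start = (ts // size) * size
    return (start, start + size)


def sliding_windows(ts, size, step):
    """Return every [start, end) window of length `size`, advancing by
    `step`, that contains ts."""
    windows = []
    start = (ts // step) * step  # latest window start at or before ts
    # A window contains ts while start <= ts < start + size.
    while start > ts - size:
        windows.append((start, start + size))
        start -= step
    return sorted(windows)


def session_windows(timestamps, gap):
    """Split one key's timestamps into sessions, returned as
    (first_event, last_event) pairs. A new session starts when the time
    since the previous event exceeds `gap` (assumed convention)."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][1] <= gap:
            sessions[-1][1] = ts  # still active: extend the current session
        else:
            sessions.append([ts, ts])  # idle too long: start a new session
    return [tuple(s) for s in sessions]
```

For example, `tumbling_window(7, 5)` returns one window, `sliding_windows(7, 10, 5)` returns two overlapping windows, and `session_windows([1, 2, 3, 20, 21, 50], 5)` returns three sessions of different lengths.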
Choosing a window
Tumbling suits clean periodic reports. Sliding suits smooth moving averages. Session suits user activity where you want to capture a visit until the user goes idle.
Key idea
Windowing slices an endless stream into bounded chunks. Tumbling, sliding, and session windows differ in whether they overlap and in whether their length is fixed or activity-based.