Why a window
Deduplication filters out repeated messages by remembering the ids it has already processed. Remembering every id forever is impractical because memory would grow without bound, so systems keep a deduplication window: a bounded span of recent ids.
How the window works
Each message carries a unique id. The receiver keeps ids seen within the window and rejects any repeat.
- A time-based window keeps ids for a fixed duration, say five minutes.
- A count-based window keeps the last N ids regardless of time.
If a duplicate arrives while its id is still inside the window, it is dropped. If it arrives after the id has aged out of the window, the system has forgotten the id and processes the message again.
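The two window types above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the class names `TimeWindowDeduper` and `CountWindowDeduper` and the `accept` method are hypothetical, timestamps are assumed to be passed in by the caller in non-decreasing order, and the time-based window is measured from the first time an id was seen (a duplicate does not extend it).

```python
from collections import OrderedDict, deque


class TimeWindowDeduper:
    """Hypothetical sketch: remembers each id for `window_seconds`."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        # id -> time first seen; insertion order is oldest first because
        # `now` is assumed never to go backwards.
        self._first_seen = OrderedDict()

    def accept(self, msg_id, now):
        """Return True to process the message, False to drop a duplicate."""
        # Forget ids that have aged out of the window.
        while self._first_seen:
            oldest_id, seen_at = next(iter(self._first_seen.items()))
            if now - seen_at <= self.window_seconds:
                break
            self._first_seen.popitem(last=False)
        if msg_id in self._first_seen:
            return False  # duplicate inside the window
        self._first_seen[msg_id] = now
        return True


class CountWindowDeduper:
    """Hypothetical sketch: remembers the last `max_ids` distinct ids."""

    def __init__(self, max_ids):
        if max_ids < 1:
            raise ValueError("max_ids must be at least 1")
        self.max_ids = max_ids
        self._order = deque()  # ids in arrival order, oldest on the left
        self._seen = set()     # same ids, for fast membership checks

    def accept(self, msg_id):
        if msg_id in self._seen:
            return False  # duplicate inside the window
        if len(self._order) == self.max_ids:
            # Window is full: evict the oldest id to stay bounded.
            self._seen.discard(self._order.popleft())
        self._order.append(msg_id)
        self._seen.add(msg_id)
        return True


dedup = TimeWindowDeduper(window_seconds=300)  # five-minute window
print(dedup.accept("m1", now=0))    # True: first sighting, processed
print(dedup.accept("m1", now=120))  # False: duplicate inside the window
print(dedup.accept("m1", now=301))  # True: id was forgotten, processed again
```

The last call shows the leak described above: once the id has aged out, the receiver cannot tell a late duplicate from a new message.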
Choosing the size
The window must outlast the longest realistic retry span, that is, the time between a message's first delivery and its last possible retry. If a sender retries for up to ten minutes but the window is five minutes, late duplicates slip through.
- Larger windows catch more duplicates but cost more memory.
- Smaller windows are cheap but leak late retries.
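The trade-off can be made concrete with some rough arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the message rate and bytes-per-id figures are made-up inputs, not measurements from any real system. It assumes a time-based window and a steady stream of distinct ids.

```python
def window_covers_retries(window_seconds, max_retry_span_seconds):
    """True if the window is at least as long as the longest retry span."""
    return window_seconds >= max_retry_span_seconds


def ids_in_window(messages_per_second, window_seconds):
    """Rough count of ids held at once, assuming every id is distinct."""
    return messages_per_second * window_seconds


def memory_bytes(messages_per_second, window_seconds, bytes_per_id):
    """Rough memory cost of the window, ignoring container overhead."""
    return ids_in_window(messages_per_second, window_seconds) * bytes_per_id


# The example from the text: retries for up to ten minutes, window of five.
print(window_covers_retries(300, 600))  # False: late duplicates slip through
print(window_covers_retries(600, 600))  # True

# Assumed load of 1000 messages per second and 64 bytes per stored id.
print(memory_bytes(1000, 300, 64))  # 19200000 bytes for a five-minute window
print(memory_bytes(1000, 600, 64))  # 38400000 bytes for a ten-minute window
```

Under these assumptions, doubling the window doubles the memory, which is the cost of closing the gap that lets late retries through.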
Key idea
A deduplication window trades perfect duplicate filtering for bounded memory, so it must be at least as long as the longest span over which a sender may retry.