The pragmatic default
Most reliable pipelines choose at-least-once delivery. The sender keeps retrying until it receives an acknowledgement, accepting that some messages will arrive more than once. Deduplication on the receiving side then discards the extra copies.
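The retry behaviour can be sketched as follows. This is a minimal illustration, not a definitive implementation: `send_with_retry`, `LossyAckChannel`, and the `max_attempts` limit are hypothetical names introduced here, and the channel simulates the case where the message arrives but its acknowledgement is lost.

```python
class LossyAckChannel:
    """Hypothetical transport: every message arrives, but some acks are lost."""

    def __init__(self, acks_to_drop):
        self.acks_to_drop = acks_to_drop
        self.delivered = []  # what the receiver actually got

    def send(self, message):
        self.delivered.append(message)  # the message itself always arrives
        if self.acks_to_drop > 0:
            self.acks_to_drop -= 1
            return False  # the ack was lost, so the sender sees a failure
        return True


def send_with_retry(send, message, max_attempts=5):
    """Call send(message) until it reports an ack; return the attempts used."""
    for attempt in range(1, max_attempts + 1):
        if send(message):
            return attempt
    raise RuntimeError(f"no acknowledgement after {max_attempts} attempts")


channel = LossyAckChannel(acks_to_drop=2)
attempts = send_with_retry(channel.send, "m1")
# Two lost acks mean three sends, so the receiver holds three copies of "m1".
```

The sender cannot tell a lost message from a lost acknowledgement, so retrying is the only safe choice, and the receiver ends up with duplicates.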
How deduplication works
- Every message carries a stable unique id, set by the producer, not the transport.
- The consumer keeps a seen set of ids that have already been processed.
- Before acting, the consumer checks the set. A known id is dropped; a new id is processed and then recorded.
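The three steps above can be sketched as a small consumer. This assumes an in-memory set and a caller-supplied `handler` callable; `DedupConsumer` and `receive` are illustrative names, and a real system would likely keep the seen set in durable storage.

```python
class DedupConsumer:
    """Drops messages whose id has already been processed."""

    def __init__(self, handler):
        self.handler = handler  # hypothetical callable that does the real work
        self.seen = set()       # ids that have already been processed

    def receive(self, msg_id, payload):
        """Return True if the message was processed, False if it was dropped."""
        if msg_id in self.seen:
            return False        # known id: drop the duplicate
        self.handler(payload)   # new id: process it first
        self.seen.add(msg_id)   # then record it as seen
        return True


processed = []
consumer = DedupConsumer(processed.append)
consumer.receive("id-1", "charge card")
consumer.receive("id-1", "charge card")  # retry of the same message, dropped
consumer.receive("id-2", "send receipt")
```

Recording the id only after the handler returns means a handler that raises leaves the id unrecorded, so a later retry is processed rather than dropped.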
The seen set cannot grow forever, so it is bounded by a time window or a sliding range of ids. Ids older than the window are evicted, on the assumption that those messages will never be redelivered.
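One way to bound the seen set by time is sketched below. It assumes the caller passes non-decreasing timestamps in seconds, so insertion order matches age; `WindowedSeenSet` and `check_and_record` are hypothetical names chosen for this example.

```python
from collections import OrderedDict


class WindowedSeenSet:
    """Remembers ids for window_seconds, then forgets them."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self._seen = OrderedDict()  # id -> time recorded, oldest entry first

    def _evict(self, now):
        # Remove entries from the old end until one is still inside the window.
        while self._seen:
            _, recorded_at = next(iter(self._seen.items()))
            if now - recorded_at <= self.window:
                break
            self._seen.popitem(last=False)

    def check_and_record(self, msg_id, now):
        """Return True if msg_id is new within the window, False if a duplicate."""
        self._evict(now)
        if msg_id in self._seen:
            return False
        self._seen[msg_id] = now
        return True

    def __len__(self):
        return len(self._seen)


seen = WindowedSeenSet(window_seconds=10)
first = seen.check_and_record("a", now=0)       # new id
duplicate = seen.check_and_record("a", now=5)   # inside the window: caught
late = seen.check_and_record("a", now=11)       # outside the window: treated as new
```

The last call shows the cost of bounding the set: once an id has been evicted, a redelivery of that message is indistinguishable from a new one.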
Tradeoffs
Deduplication costs storage plus one lookup per message. The window must be longer than the longest possible retry delay; otherwise a late duplicate arrives after its id has been evicted and is processed again. The choice of id matters too: a business key, rather than a random token, lets the consumer detect the same logical event even when different producers send it.
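The difference between the two kinds of id can be illustrated with a short sketch. The fields `order_id` and `event_type` are assumptions made up for this example, as is the choice of hashing them with SHA-256; any stable function of the business fields would serve.

```python
import hashlib
import uuid


def business_key_id(order_id, event_type):
    """Derive the id from business fields, so equal events get equal ids.

    Assumes neither field contains ":", which is used here as the separator.
    """
    raw = f"{order_id}:{event_type}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


def random_token_id():
    """A fresh random id on every call, unrelated to the message content."""
    return str(uuid.uuid4())


# Two producers independently report the same event.
id_from_producer_a = business_key_id("order-42", "paid")
id_from_producer_b = business_key_id("order-42", "paid")
same_event_detected = id_from_producer_a == id_from_producer_b

# With random tokens, the same two reports would look unrelated.
tokens_match = random_token_id() == random_token_id()
```

A random token still deduplicates retries from a single producer, provided the producer reuses the token on every retry; it cannot link copies that originate from different producers.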
Key idea
At-least-once delivery plus a deduplication store with stable ids and a bounded window gives reliable processing without losing messages.