Requirements
- Deliver events to customer-registered HTTP endpoints.
- Retry failed deliveries and never silently drop events.
- Handle slow or unavailable endpoints without blocking delivery to others.
High-level design
Events are queued and delivered by workers, with retries and dead-lettering.
- Ingestion: an event is persisted and enqueued for delivery.
- Delivery workers: pull events, sign the payload, POST to the endpoint, and record the result.
- Retries: failed deliveries are retried with exponential backoff up to a limit, then sent to a dead-letter queue.
- Isolation: per-endpoint queues prevent one slow customer from blocking others.
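The sign, send, retry, and dead-letter steps above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the names `sign_payload`, `backoff_delays`, and `deliver` are hypothetical, HMAC-SHA256 is an assumed signing scheme, the base delay of 1 second with a doubling factor is an assumed schedule, and `send` stands in for the real HTTP POST so the sketch runs without a network.

```python
import hashlib
import hmac
import time
from typing import Callable, List, Tuple


def sign_payload(secret: bytes, payload: bytes) -> str:
    """Hex HMAC-SHA256 of the payload (assumed scheme); the receiver recomputes it."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def backoff_delays(base: float, factor: float, max_attempts: int) -> List[float]:
    """Delay before each retry: base, base*factor, ... (max_attempts - 1 entries)."""
    return [base * factor ** i for i in range(max_attempts - 1)]


def deliver(
    event_id: str,
    payload: bytes,
    secret: bytes,
    send: Callable[[bytes, str], bool],  # hypothetical: POSTs and returns True on success
    dead_letters: List[str],             # stand-in for a real dead-letter queue
    max_attempts: int = 5,               # assumed retry limit
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[bool, int]:
    """Try to deliver one event; returns (delivered, attempts_used)."""
    signature = sign_payload(secret, payload)
    delays = backoff_delays(base=1.0, factor=2.0, max_attempts=max_attempts)
    for attempt in range(1, max_attempts + 1):
        if send(payload, signature):
            return True, attempt
        if attempt < max_attempts:
            # Wait longer after each failure before trying again.
            sleep(delays[attempt - 1])
    # Retries exhausted: park the event instead of dropping it silently.
    dead_letters.append(event_id)
    return False, max_attempts
```

A production worker would typically schedule the retry rather than sleep in place, and would add jitter to the delays; both are left out here to keep the sketch short.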
Bottlenecks
- Slow endpoints: per-endpoint queues and request timeouts stop one customer from starving others of worker capacity.
- Duplicate delivery: at-least-once delivery means receivers should dedupe by event ID.
- Ordering: strict ordering needs a single in-flight delivery per endpoint, which limits throughput.
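On the receiving side, deduplication by event ID can be as small as the sketch below. The class name `DedupingReceiver` and its in-memory set are assumptions made for illustration; a real receiver would more likely keep seen IDs in a durable store with an expiry.

```python
from typing import List, Set


class DedupingReceiver:
    """Processes each event ID at most once, even if it is delivered repeatedly."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()   # in-memory only; assumed for the sketch
        self.processed: List[str] = []

    def handle(self, event_id: str, body: str) -> bool:
        """Return True if the event was processed, False if it was a duplicate."""
        if event_id in self._seen:
            # Already handled: acknowledge without repeating side effects.
            return False
        self._seen.add(event_id)
        self.processed.append(body)
        return True
```

A duplicate should still be acknowledged with a success response, otherwise the sender keeps retrying it.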
Tradeoffs
- At-least-once delivery is simpler than exactly-once but pushes deduplication to the receiver.
- Strict ordering limits parallelism, so many systems offer best-effort ordering.
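The ordering tradeoff can be made concrete with a small dispatcher that keeps one FIFO queue per endpoint and allows at most one in-flight delivery for each. This is a sketch under assumptions: `EndpointQueues`, `next_batch`, and `complete` are hypothetical names, and failure handling (putting a failed event back at the head of its queue) is omitted.

```python
from collections import deque
from typing import Deque, Dict, List, Set, Tuple


class EndpointQueues:
    """Per-endpoint FIFO queues with a single in-flight delivery per endpoint."""

    def __init__(self) -> None:
        self._queues: Dict[str, Deque[str]] = {}
        self._in_flight: Set[str] = set()

    def enqueue(self, endpoint: str, event_id: str) -> None:
        self._queues.setdefault(endpoint, deque()).append(event_id)

    def next_batch(self) -> List[Tuple[str, str]]:
        """Take at most one event from each endpoint that has nothing in flight."""
        batch: List[Tuple[str, str]] = []
        for endpoint, queue in self._queues.items():
            if queue and endpoint not in self._in_flight:
                batch.append((endpoint, queue.popleft()))
                self._in_flight.add(endpoint)
        return batch

    def complete(self, endpoint: str) -> None:
        """Mark the endpoint's in-flight delivery as finished."""
        self._in_flight.discard(endpoint)
```

An endpoint that is slow to respond only holds back its own queue: other endpoints keep receiving events in every batch. Relaxing the one-in-flight rule raises throughput per endpoint at the cost of strict ordering.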
Key idea
A webhook system persists events, then delivers them through per-endpoint queues with signed payloads, backoff retries, and a dead-letter queue for poison events.