System Design

The Heartbeat and Reconnect

Detecting dead connections with periodic pings and recovering from them with backoff-based reconnects.

The silent failure problem

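A long-lived connection can die without either side noticing: when the network path is dropped, no close packet (such as a TCP FIN or RST) reaches either end. The socket still looks open, but messages never arrive. Heartbeats solve this by sending a tiny message on a fixed interval and treating missing replies as a sign that the connection is dead.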

How heartbeats work

  • The client or server sends a ping every few seconds and expects a pong back.
  • If several pongs are missed in a row, the peer is declared dead and the connection is closed.
  • The interval trades detection speed against extra traffic.
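The counting rule above can be sketched as a small state machine. This is a minimal illustration, not a definitive implementation: the class name `HeartbeatMonitor`, the 5 second interval, and the threshold of 3 missed pongs are all assumptions chosen for the example, and the actual sending of pings is left to the caller.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Tracks missed pongs for one connection (hypothetical helper)."""

    interval_s: float = 5.0    # assumed ping interval, in seconds
    max_missed: int = 3        # assumed number of missed pongs before giving up
    missed: int = 0
    awaiting_pong: bool = False

    def on_tick(self) -> str:
        """Call once per interval. Returns "ping" to send a ping, or "dead"."""
        if self.awaiting_pong:
            # The ping sent on the previous tick was never answered.
            self.missed += 1
            if self.missed >= self.max_missed:
                return "dead"
        self.awaiting_pong = True
        return "ping"

    def on_pong(self) -> None:
        """Call when a pong arrives; a reply resets the miss counter."""
        self.awaiting_pong = False
        self.missed = 0

    def worst_case_detection_s(self) -> float:
        """Approximate upper bound on the time needed to notice a dead peer.

        If the link dies just after a pong, the next ping is only sent one
        interval later, and max_missed further ticks pass before "dead".
        """
        return (self.max_missed + 1) * self.interval_s
```

With these assumed values a dead peer is noticed within roughly 20 seconds. Halving the interval halves that bound but doubles the ping traffic, which is the trade-off described in the last bullet.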

Reconnecting safely

  • When the link drops, reconnect with exponential backoff so that a recovering server is not hit by a stampede of simultaneous retries.
  • Add jitter to spread retries across clients.
  • On reconnect, resend any unacknowledged messages and resume from the last seen position.
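The three steps above can be sketched as follows. This is an illustrative sketch under stated assumptions: `backoff_delay` and `Outbox` are hypothetical names, the 0.5 second base and 30 second cap are arbitrary example values, and the jitter shown is the "full jitter" variant (a uniformly random delay between zero and the exponential ceiling), which is one of several common choices.

```python
import random
from typing import Dict, List, Optional, Tuple


def backoff_delay(attempt: int, base_s: float = 0.5, cap_s: float = 30.0,
                  rng: Optional[random.Random] = None) -> float:
    """Delay before reconnect attempt `attempt` (0-based), with full jitter."""
    source = rng if rng is not None else random
    # Exponential growth, capped so the delay never grows without bound.
    ceiling = min(cap_s, base_s * (2 ** attempt))
    # Jitter: pick a random point below the ceiling so clients spread out.
    return source.uniform(0.0, ceiling)


class Outbox:
    """Keeps sent messages until the peer acknowledges them (hypothetical)."""

    def __init__(self) -> None:
        self._next_seq = 0
        self._unacked: Dict[int, str] = {}

    def send(self, payload: str) -> int:
        """Record a message as sent and return its sequence number."""
        seq = self._next_seq
        self._next_seq += 1
        self._unacked[seq] = payload
        return seq

    def ack(self, up_to_seq: int) -> None:
        """The peer confirms everything up to and including `up_to_seq`."""
        for seq in [s for s in self._unacked if s <= up_to_seq]:
            del self._unacked[seq]

    def to_resend(self) -> List[Tuple[int, str]]:
        """Messages to replay after a reconnect, in sending order."""
        return sorted(self._unacked.items())
```

A client loop would sleep for `backoff_delay(attempt)` before each reconnect attempt, reset `attempt` to zero once a connection succeeds, and then replay `to_resend()`. Resuming from the last seen position works the same way in the other direction: the client reports the highest sequence number it received, and the server replays everything after it.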

Key idea

Heartbeats turn invisible connection death into a detectable event, and backoff with jitter lets clients reconnect without overwhelming a recovering server.

Check yourself

1. Why are heartbeats needed on long-lived connections?

2. What prevents a reconnect stampede after an outage?