System Design

Graceful Shutdown and Draining

Letting in-flight work finish before a server exits, so deploys cause no errors.

The deploy problem

During a deploy or scale-down, instances stop. If a server is killed while it is handling requests, those requests fail and users see errors. Graceful shutdown avoids this by letting work in progress complete.

The drain sequence

When a server receives a stop signal (such as SIGTERM), it should not exit immediately. It drains first.

  • Tell the load balancer the instance is unhealthy, then stop accepting new connections.
  • Finish the requests already in flight.
  • Close pools and flush buffers, then exit cleanly.
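
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a production server: the `DrainingServer` class and all of its method names are hypothetical, request handling is reduced to a counter, and closing pools is a placeholder. Only `signal.signal`, `threading.Event`, and `threading.Condition` are standard library calls.

```python
import signal
import threading


class DrainingServer:
    """Toy model of a server's drain sequence (all names are hypothetical)."""

    def __init__(self):
        self.healthy = True      # what a health check endpoint would report
        self.accepting = True    # whether new requests are admitted
        self.in_flight = 0       # requests currently being handled
        self.steps = []          # order in which shutdown steps ran
        self.stop_requested = threading.Event()
        self._cond = threading.Condition()

    def install_signal_handler(self):
        # The handler only sets a flag; the main loop is assumed to notice it
        # and call drain(). Must be called from the main thread.
        signal.signal(signal.SIGTERM,
                      lambda signum, frame: self.stop_requested.set())

    def begin_request(self):
        """Admit a request, or return False once draining has started."""
        with self._cond:
            if not self.accepting:
                return False
            self.in_flight += 1
            return True

    def end_request(self):
        with self._cond:
            self.in_flight -= 1
            self._cond.notify_all()

    def drain(self):
        with self._cond:
            # Step 1: leave the load balancer and refuse new work.
            self.healthy = False
            self.accepting = False
            self.steps.append("stop accepting")
            # Step 2: wait for requests already in flight.
            while self.in_flight > 0:
                self._cond.wait()
            self.steps.append("in-flight finished")
        # Step 3: close pools and flush buffers (placeholder), then exit.
        self.steps.append("resources closed")
```

A request that is admitted before `drain()` starts is allowed to finish, while any `begin_request()` call made afterwards returns `False`. This version waits indefinitely in step 2, which is the weakness a grace period addresses.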

Why the order matters

The load balancer needs time to notice the instance is leaving and route new traffic elsewhere. A brief pause after marking the instance unhealthy, and before refusing connections, closes the window in which traffic still arrives but is rejected.
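
A rough sketch of that ordering, in Python. It assumes a load balancer that polls a health endpoint at a fixed interval and removes an instance after a number of consecutive failures, so the pause is sized as interval times threshold plus a margin. That sizing rule, the `Instance` class, and every name and number here are illustrative assumptions, not values from any particular load balancer.

```python
import time


def deregistration_pause(check_interval_s, unhealthy_threshold, margin_s=1.0):
    """Estimate how long the load balancer needs to notice a failing check.

    Assumes the balancer needs `unhealthy_threshold` consecutive failed
    checks, one every `check_interval_s` seconds (hypothetical model).
    """
    return check_interval_s * unhealthy_threshold + margin_s


class Instance:
    """Toy instance with a health status and a listening flag (hypothetical)."""

    def __init__(self):
        self.healthy = True
        self.listening = True

    def health_status(self):
        # 200 keeps the instance in rotation; 503 asks to be taken out.
        return 200 if self.healthy else 503

    def leave_load_balancer(self, pause_s, sleep=time.sleep):
        # 1. Fail the health check first, while still serving traffic.
        self.healthy = False
        # 2. Keep listening during the pause so late arrivals still succeed.
        sleep(pause_s)
        # 3. Only now refuse new connections.
        self.listening = False
```

The `sleep` parameter is injectable only so the ordering can be checked without actually waiting. During the pause the instance reports 503 but is still listening, which is the state that lets traffic already routed to it succeed.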

Bound the wait

Give draining a deadline. If requests do not finish within the grace period, force the exit so that a single stuck request cannot block the deploy forever.
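
One way to bound the wait, sketched in Python with `threading.Condition.wait_for` and a timeout. The `InFlightTracker` class, the `shutdown_exit_code` function, and the choice of exit codes are hypothetical; the grace period value is whatever the deployment allows.

```python
import threading


class InFlightTracker:
    """Counts requests in progress (hypothetical helper)."""

    def __init__(self):
        self._count = 0
        self._cond = threading.Condition()

    def start(self):
        with self._cond:
            self._count += 1

    def finish(self):
        with self._cond:
            self._count -= 1
            self._cond.notify_all()

    def wait_until_idle(self, grace_period_s):
        """Return True if all requests finished before the deadline."""
        with self._cond:
            return self._cond.wait_for(lambda: self._count == 0,
                                       timeout=grace_period_s)


def shutdown_exit_code(tracker, grace_period_s):
    """Drain with a deadline and report how the shutdown ended.

    0 means a clean drain; 1 means the grace period expired and the
    caller should force the exit (the codes are an arbitrary choice).
    """
    drained = tracker.wait_until_idle(grace_period_s)
    return 0 if drained else 1
```

The caller would pass the result to `sys.exit`. In a real Python service, a stuck request running on a non-daemon thread can still keep the process alive after the deadline, so a forced exit may need something stronger than a normal return from the main thread.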

Key idea

Graceful shutdown drains by leaving the load balancer, finishing in-flight work, then exiting within a bounded grace period, so deploys cause no user-facing errors.

Check yourself

1. What does draining do during a graceful shutdown?

2. Why mark the instance unhealthy before refusing connections?