System Design

Autoscaling Policies

Choose how a fleet grows and shrinks: by metric targets, schedules, or predictive forecasts.

What autoscaling decides

An autoscaling policy decides when to add or remove instances. The goal is to track demand closely enough to stay responsive without paying for idle capacity.

Common policy types

  • Target tracking holds a metric near a setpoint, for example keeping average CPU at sixty percent by adding nodes when it climbs and removing them when it falls.
  • Step scaling adds or removes different amounts of capacity depending on how far past a threshold the metric is.
  • Scheduled scaling changes capacity at known times, like scaling up before a daily peak.
  • Predictive scaling forecasts demand from history and provisions ahead of it.
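
The first two policy types can be sketched in a few lines. This is a minimal illustration, not any provider's actual algorithm: the function names, the proportional rule, and the step table are assumptions made for the example. The proportional rule has the same shape as the formula documented for the Kubernetes Horizontal Pod Autoscaler, but real controllers add tolerances and limits on top.

```python
import math


def target_tracking_desired(current_instances: int, metric: float, target: float) -> int:
    """Hypothetical target tracking rule: resize the fleet in proportion to
    how far the metric is from its setpoint, rounding up, never below 1."""
    return max(1, math.ceil(current_instances * metric / target))


def step_scaling_delta(metric: float, threshold: float,
                       steps: list[tuple[float, int]]) -> int:
    """Hypothetical step scaling rule: return how many instances to add.

    steps is a list of (minimum_breach, instances_to_add) sorted by
    ascending minimum_breach; the largest matching step wins.
    """
    breach = metric - threshold
    delta = 0
    for minimum_breach, instances_to_add in steps:
        if breach >= minimum_breach:
            delta = instances_to_add
    return delta


# Example values, all assumed: 10 instances, CPU setpoint of 60 percent.
grown = target_tracking_desired(10, metric=90.0, target=60.0)
shrunk = target_tracking_desired(10, metric=30.0, target=60.0)

# Assumed step table: add 1 just past the threshold, 2 at 10 points past, 4 at 20 points past.
STEPS = [(0.0, 1), (10.0, 2), (20.0, 4)]
small_breach = step_scaling_delta(72.0, threshold=70.0, steps=STEPS)
large_breach = step_scaling_delta(95.0, threshold=70.0, steps=STEPS)
```

With these assumed numbers, target tracking grows the fleet to 15 when CPU reaches ninety percent and shrinks it to 5 when CPU drops to thirty, while step scaling adds 1 instance for a small breach and 4 for a large one.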

Tuning that matters

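  • Cooldowns make the controller wait after a scaling action, so it does not react again before new instances warm up and show up in the metric.
  • Warm-up time means reactive scaling always lags a sudden spike, so headroom or prediction has to cover the gap.
  • Flapping happens when the scale-up and scale-down thresholds sit too close together, so the fleet churns instances as the metric oscillates between them.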
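
Cooldowns and well-separated thresholds can be combined in one small decision function. The sketch below is illustrative only: the `Scaler` class, its field names, and the threshold and cooldown values are assumptions, not settings from any particular autoscaler.

```python
from dataclasses import dataclass


@dataclass
class Scaler:
    # All values below are assumed for illustration.
    scale_up_at: float = 75.0     # add capacity above this CPU percentage
    scale_down_at: float = 40.0   # remove capacity below this; kept far from scale_up_at
    cooldown_s: float = 300.0     # seconds to wait after any scaling action
    last_action_at: float = float("-inf")

    def decide(self, now: float, cpu: float) -> int:
        """Return +1 to add an instance, -1 to remove one, 0 to hold."""
        # Cooldown: ignore the metric until the last action has had time to take effect.
        if now - self.last_action_at < self.cooldown_s:
            return 0
        if cpu > self.scale_up_at:
            self.last_action_at = now
            return 1
        if cpu < self.scale_down_at:
            self.last_action_at = now
            return -1
        # Between the two thresholds nothing happens, which is what prevents flapping.
        return 0


scaler = Scaler()
decisions = [
    scaler.decide(now=0, cpu=80.0),    # above the upper threshold
    scaler.decide(now=60, cpu=90.0),   # still hot, but inside the cooldown
    scaler.decide(now=300, cpu=50.0),  # cooldown over, metric inside the dead band
    scaler.decide(now=400, cpu=30.0),  # below the lower threshold
    scaler.decide(now=500, cpu=30.0),  # inside the new cooldown
]
```

The gap between the two thresholds acts as a dead band: a metric that hovers around fifty percent triggers no action in either direction.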

Key idea

Autoscaling policies trade cost against responsiveness, so combine reactive target tracking with cooldowns and either headroom or prediction to absorb the delay before new capacity is ready.
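
A toy simulation makes the lag and the value of headroom concrete. Everything here is an assumption for illustration: the function name, the tick-based time model, the capacity of 100 requests per instance, and the demand trace. The sketch only scales up, and it sizes the fleet so that each instance runs at a target utilization percentage.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def overloaded_ticks(demand: list[int], per_instance: int = 100,
                     target_pct: int = 60, warmup: int = 3) -> int:
    """Count ticks where demand exceeds the capacity of ready instances.

    New instances are ordered reactively and become ready `warmup` ticks later.
    """
    ready = ceil_div(demand[0] * 100, per_instance * target_pct)
    pending: list[int] = []  # tick at which each ordered instance becomes ready
    overloaded = 0
    for tick, load in enumerate(demand):
        # Instances whose warm-up has finished join the ready pool.
        ready += sum(1 for ready_at in pending if ready_at == tick)
        pending = [ready_at for ready_at in pending if ready_at > tick]
        if load > ready * per_instance:
            overloaded += 1
        # Reactive step: order whatever is missing to get back to the target.
        desired = ceil_div(load * 100, per_instance * target_pct)
        missing = desired - ready - len(pending)
        pending.extend([tick + warmup] * max(0, missing))
    return overloaded


# Assumed trace: steady 600 requests per tick, then a sudden jump to 1500.
spike = [600] * 3 + [1500] * 10

tight = overloaded_ticks(spike, target_pct=60)   # little headroom
roomy = overloaded_ticks(spike, target_pct=40)   # more headroom, more idle cost
```

In this trace the sixty percent target starts with 10 instances and is overloaded for 3 ticks, exactly the warm-up time. The forty percent target starts with 15 instances, which is enough spare capacity to ride out the spike with no overloaded ticks, at the price of paying for more idle capacity beforehand.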

Check yourself

1. What does a target tracking policy do?

2. Why does reactive autoscaling lag a sudden spike?