Autoscaling Policies

Choose how a fleet grows and shrinks: by metric targets, schedules, or predictive forecasts.

What autoscaling decides

An autoscaling policy decides when to add or remove instances. The goal is to track demand closely enough to stay responsive without paying for idle capacity.

Common policy types

Target tracking holds a metric near a setpoint, for example keeping average CPU at sixty percent by adding nodes when it climbs.
Step scaling adds or removes capacity in steps, with larger steps the further the metric is past a threshold.
Scheduled scaling changes capacity at known times, like scaling up before a daily peak.
Predictive scaling forecasts demand from history and provisions ahead of it.
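
The two reactive policies above can be sketched in a few lines. This is a minimal illustration, not any provider's actual algorithm: the function names, the proportional rule for target tracking, and the shape of the step table are all assumptions made for the example.

```python
import math


def target_tracking_desired(current_instances: int, metric: float, target: float) -> int:
    """Proportional rule (an assumption): resize the fleet by metric / target.

    If 10 instances average 90% CPU and the target is 60%, the same load
    spread over 15 instances would sit near the target.
    """
    if current_instances < 1 or target <= 0:
        raise ValueError("need at least one instance and a positive target")
    # Round up so the fleet lands at or below the target, never above it.
    return max(1, math.ceil(current_instances * metric / target))


def step_scaling_delta(metric: float, threshold: float, steps: list[tuple[float, int]]) -> int:
    """Return how many instances to add for a given breach of the threshold.

    `steps` is a hypothetical table of (minimum breach, instances to add),
    sorted by ascending breach. The last row the breach satisfies wins.
    """
    breach = metric - threshold
    delta = 0
    for min_breach, change in steps:
        if breach >= min_breach:
            delta = change
    return delta


# Hypothetical step table: bigger breaches add more instances.
STEPS = [(0.0, 1), (10.0, 2), (20.0, 4)]

print(target_tracking_desired(10, 90.0, 60.0))   # 15
print(step_scaling_delta(85.0, 70.0, STEPS))     # 2
```

Target tracking computes a size directly from the metric, while step scaling only says how much to change, which is why step tables usually need tuning per workload.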

Tuning that matters

Cooldowns stop the controller from acting again before the instances it just launched have warmed up and started to move the metric.
Warm-up time means reactive scaling always lags a sudden spike, so headroom or prediction has to cover the gap.
Flapping happens when the scale-up and scale-down thresholds sit too close together, so the fleet repeatedly launches and terminates instances.
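
As a rough sketch of how a cooldown and a gap between thresholds work together, the controller below ignores the metric while a cooldown is active and does nothing while the metric sits between the two thresholds. The class name, the threshold values, and the 300-second cooldown are illustrative assumptions, not recommended settings.

```python
from dataclasses import dataclass


@dataclass
class ThresholdScaler:
    """Hypothetical controller with a cooldown and a dead band."""

    scale_up_at: float = 70.0     # assumed upper threshold (percent)
    scale_down_at: float = 40.0   # assumed lower threshold, kept well below the upper one
    cooldown_s: float = 300.0     # assumed time for new instances to warm up
    last_action_at: float = float("-inf")

    def decide(self, metric: float, now: float) -> int:
        """Return +1 to add an instance, -1 to remove one, 0 to hold."""
        # Cooldown: skip decisions until the last change has had time to take effect.
        if now - self.last_action_at < self.cooldown_s:
            return 0
        if metric > self.scale_up_at:
            self.last_action_at = now
            return 1
        if metric < self.scale_down_at:
            self.last_action_at = now
            return -1
        # Dead band between the thresholds: holding here is what prevents flapping.
        return 0


scaler = ThresholdScaler()
print(scaler.decide(80.0, now=0.0))     # 1, above the upper threshold
print(scaler.decide(90.0, now=100.0))   # 0, still inside the cooldown
print(scaler.decide(55.0, now=400.0))   # 0, inside the dead band
```

Narrowing the gap between `scale_up_at` and `scale_down_at` makes the controller more sensitive, at the cost of more churn.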

Key idea

Autoscaling policies trade cost against responsiveness, so combine reactive target tracking with cooldowns and either headroom or prediction to absorb the delay before new capacity is ready.
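
One way to put a number on the headroom option is to ask how much demand can arrive during warm-up. The sketch below assumes demand grows linearly during a spike and that each instance serves a fixed request rate; both are simplifications, and the function and parameter names are made up for the example.

```python
import math


def headroom_instances(growth_rps_per_s: float, warmup_s: float, per_instance_rps: float) -> int:
    """Spare instances needed to absorb growth while new capacity warms up.

    Assumes linear growth in requests per second and a fixed per-instance rate.
    """
    if per_instance_rps <= 0:
        raise ValueError("per_instance_rps must be positive")
    extra_rps = growth_rps_per_s * warmup_s  # demand added before new instances are ready
    return math.ceil(extra_rps / per_instance_rps)


# Demand rising by 2 req/s every second, 120 s warm-up, 50 req/s per instance.
print(headroom_instances(2.0, 120.0, 50.0))  # 5
```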

Check yourself