Machine Learning

The Gradient Descent Intuition

Follow the slope downhill to minimize a loss one step at a time.

The picture

Imagine the loss as a hilly landscape over the parameters. Gradient descent walks downhill by repeatedly stepping in the direction that lowers the loss fastest.

  • The gradient points in the direction of steepest increase.
  • We step in the opposite direction, scaled by the learning rate.
  • We repeat until the gradient is near zero.
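The three bullets above can be sketched as a short loop. This is a minimal illustration, not a production optimizer: the one-parameter loss (x - 3)^2, the names `loss`, `grad`, and `gradient_descent`, and the default learning rate and tolerance are all assumptions chosen for the example.

```python
def loss(x):
    # Hypothetical loss: a single valley with its lowest point at x = 3.
    return (x - 3.0) ** 2


def grad(x):
    # Derivative of the loss; it points toward increasing loss.
    return 2.0 * (x - 3.0)


def gradient_descent(x0, learning_rate=0.1, tol=1e-6, max_steps=10_000):
    x = x0
    for step in range(max_steps):
        g = grad(x)
        if abs(g) < tol:
            # Gradient is near zero: we are close to a flat spot, so stop.
            break
        # Step in the opposite direction of the gradient,
        # scaled by the learning rate.
        x -= learning_rate * g
    return x, step


x_min, steps = gradient_descent(x0=0.0)
print(f"stopped near x = {x_min:.6f} after {steps} steps")
```

Starting from x = 0, the loop should settle close to x = 3, the bottom of this particular valley.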

The update rule

Each step computes the gradient of the loss with respect to every parameter, then moves each parameter a small amount, set by the learning rate, against its own partial derivative. Small steps trace a smooth path toward a minimum.
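Written as a formula, one common way to state this step is the following. The symbols are assumptions introduced here for illustration: θ_t for the parameters at step t, η for the learning rate, and L for the loss.

```latex
\theta_{t+1} = \theta_t - \eta \, \nabla_{\theta} L(\theta_t)
```

The minus sign is the "against the gradient" part, and η controls how far each step goes.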

  • A large step can overshoot the valley.
  • A tiny step is safe but slow.
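To see both failure modes side by side, here is a small sketch on the hypothetical loss x^2, whose gradient is 2x. The function name `run` and the three learning rates are assumptions picked to make the contrast visible.

```python
def run(learning_rate, x0=1.0, steps=20):
    # Gradient descent on the assumed loss f(x) = x**2 (gradient 2 * x).
    x = x0
    path = [x]
    for _ in range(steps):
        x -= learning_rate * 2.0 * x
        path.append(x)
    return path


large = run(1.1)       # overshoots: jumps across the valley and climbs away
tiny = run(0.001)      # safe but slow: barely moves in 20 steps
moderate = run(0.1)    # steady progress toward the minimum at x = 0

print("large   :", large[-1])
print("tiny    :", tiny[-1])
print("moderate:", moderate[-1])
```

With the large rate the very first step lands on the other side of the minimum, and each later step ends up farther away than the last. With the tiny rate the parameter is still close to its starting value after 20 steps.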

Where it goes

On a smooth surface the path curves toward a low point. The slope flattens as we approach a minimum, so even with a fixed learning rate the steps naturally shrink near the bottom.
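The shrinking steps can be checked numerically. The sketch below again assumes the loss x^2 and a fixed learning rate of 0.1; `step_sizes` is a hypothetical helper that records how far each update moves.

```python
def step_sizes(x0=4.0, learning_rate=0.1, steps=10):
    # Track the size of each update on the assumed loss f(x) = x**2.
    x = x0
    sizes = []
    for _ in range(steps):
        step = learning_rate * 2.0 * x  # learning rate times the gradient
        sizes.append(abs(step))
        x -= step
    return sizes


sizes = step_sizes()
for i, s in enumerate(sizes):
    print(f"step {i}: moved {s:.4f}")
```

The learning rate never changes here, yet each move is smaller than the one before it, because the gradient itself gets smaller as the slope flattens.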

Gradient descent, along with variants such as stochastic gradient descent, is the engine behind most model training, from linear regression to deep networks.
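As a small taste of the linear regression case, here is a sketch that fits a line y = w*x + b by gradient descent on mean squared error. The toy data, the function name `fit_line`, and the learning rate and step count are all assumptions made up for this example.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    # Fit y = w * x + b by gradient descent on mean squared error.
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction errors for the current parameters.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error.
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        # Move each parameter against its own gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from the line y = 2x + 1 (no noise).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

w, b = fit_line(xs, ys)
print(f"w = {w:.4f}, b = {b:.4f}")
```

On this noise-free data the fitted slope and intercept should end up very close to 2 and 1. Training a deep network follows the same loop, only with many more parameters and gradients computed automatically.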

Key idea

Gradient descent minimizes a loss by repeatedly stepping against the gradient, letting the local slope guide each move toward a valley.

Check yourself

1. In gradient descent we move parameters in which direction?

2. What scales the size of each step?