Machine Learning

Convexity and Local Minima

Convex bowls have one minimum, while bumpy surfaces hide many traps.

Convex versus nonconvex

A function is convex if the line segment between any two points on its graph never dips below the graph. A convex loss looks like a single bowl.
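
The definition above can be spot-checked numerically. The sketch below samples random pairs of points and tests whether the chord between them ever falls below the curve. The helper `chord_stays_above` and the two example functions `bowl` and `bumpy` are hypothetical names chosen for illustration, and a passing check is only evidence of convexity on the sampled interval, not a proof.

```python
import random

def chord_stays_above(f, lo, hi, trials=10_000, tol=1e-9, seed=0):
    """Randomly test f(t*a + (1-t)*b) <= t*f(a) + (1-t)*f(b) on [lo, hi]."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b = rng.uniform(lo, hi), rng.uniform(lo, hi)
        t = rng.random()
        on_graph = f(t * a + (1 - t) * b)        # height of the curve
        on_chord = t * f(a) + (1 - t) * f(b)     # height of the line segment
        if on_graph > on_chord + tol:
            return False  # this chord dips below the graph
    return True  # no violation found in the samples

def bowl(x):
    # Convex: a single bowl.
    return x * x

def bumpy(x):
    # Nonconvex: curves downward near x = 0.
    return x**4 - 3 * x**2 + x

bowl_ok = chord_stays_above(bowl, -3.0, 3.0)
bumpy_ok = chord_stays_above(bumpy, -3.0, 3.0)
print(bowl_ok, bumpy_ok)
```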

  • Convex: every local minimum is also a global minimum.
  • Gradient descent on a convex loss, with a suitably small step size, converges toward the best solution.
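
As a minimal sketch of the two points above, the code below runs plain gradient descent on a one-dimensional convex loss from several starting points. The loss `(x - 3)^2 + 1`, the step size `lr=0.1`, and the function names are assumptions made for the example. Every start ends at the same minimizer because the bowl has only one bottom.

```python
def gradient_descent(grad, x0, lr=0.1, steps=200):
    """Plain gradient descent in one dimension."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)  # step against the slope
    return x

def convex_loss(x):
    # A single bowl with its minimum at x = 3.
    return (x - 3.0) ** 2 + 1.0

def convex_grad(x):
    # Derivative of convex_loss.
    return 2.0 * (x - 3.0)

starts = [-10.0, 0.0, 3.5, 25.0]
finals = [gradient_descent(convex_grad, x0) for x0 in starts]
print(finals)  # all entries are very close to 3.0
```

The step size matters even here: on this loss, a value of `lr` above 1.0 makes each update overshoot by more than it corrects, so the iterates diverge despite the loss being convex.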

Local minima

A local minimum is a point whose loss is no higher than at any nearby point, but which is not necessarily the lowest overall. Nonconvex losses, like those of deep networks, can have many.

  • Gradient descent may settle into a local minimum that is not the global one.
  • Different starting points can reach different minima.
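
The dependence on the starting point can be seen on a small nonconvex example. The sketch below uses the illustrative loss `x^4 - 3x^2 + x`, which has two valleys; the function names, step size, and starting points are assumptions chosen for the demonstration.

```python
def gradient_descent(grad, x0, lr=0.01, steps=2000):
    """Plain gradient descent in one dimension."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

def bumpy_loss(x):
    # Two valleys: a deeper one on the left, a shallower one on the right.
    return x**4 - 3 * x**2 + x

def bumpy_grad(x):
    # Derivative of bumpy_loss.
    return 4 * x**3 - 6 * x + 1

left = gradient_descent(bumpy_grad, x0=-2.0)   # settles near x = -1.30
right = gradient_descent(bumpy_grad, x0=2.0)   # settles near x = 1.13

print(left, bumpy_loss(left))
print(right, bumpy_loss(right))
```

Both runs stop where the gradient is zero, but the run started on the right ends in the shallower valley and reports a higher loss than the run started on the left.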

Practical view

In deep learning the surface is highly nonconvex, yet training often works. Many local minima reach similarly low loss, and the bigger obstacles are often saddle points (flat spots that slope down in some directions and up in others) rather than bad minima.
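
To make the saddle-point remark concrete, here is a toy two-dimensional sketch. The loss `x^2 + y^4/4 - y^2/2` is an assumption picked for illustration: it has a saddle at the origin and two minima at `y = ±1`. Started exactly on the line `y = 0`, gradient descent slides into the saddle and stays there; a tiny nudge in `y` is enough for it to escape to a minimum.

```python
def saddle_loss(x, y):
    # Saddle at (0, 0) with loss 0; minima at (0, 1) and (0, -1) with loss -0.25.
    return x**2 + y**4 / 4 - y**2 / 2

def saddle_grad(x, y):
    # Partial derivatives of saddle_loss with respect to x and y.
    return 2 * x, y**3 - y

def descend(x, y, lr=0.1, steps=500):
    """Plain gradient descent in two dimensions."""
    for _ in range(steps):
        gx, gy = saddle_grad(x, y)
        x, y = x - lr * gx, y - lr * gy
    return x, y

stuck = descend(1.0, 0.0)      # y-gradient is exactly zero, so y never moves
escaped = descend(1.0, 1e-6)   # the small offset in y grows until y is near 1

print(stuck, saddle_loss(*stuck))
print(escaped, saddle_loss(*escaped))
```

In real training, noise from minibatches usually supplies this kind of nudge, which is one reason saddle points tend to slow training down rather than stop it.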

Knowing whether a loss is convex tells you whether you are guaranteed the best answer or merely a good one.

Key idea

For a convex loss every local minimum is global, so gradient descent with a suitable step size reaches the best loss. Nonconvex losses hide many local minima, though in deep nets many of them are good enough.

Check yourself

1. For a convex loss, every local minimum is also what?

2. What is true of a typical deep network loss surface?