Machine Learning

LSTM Cells

A gated recurrent cell whose cell state preserves long-range memory.

What it is

A long short-term memory (LSTM) cell is a recurrent unit designed to retain information over many time steps. Alongside the usual hidden state, it keeps a separate cell state that carries memory along the sequence largely unchanged unless the gates modify it.

The three gates

An LSTM controls its memory with three learned gates. Each is a small sigmoid layer that reads the current input and the previous hidden state and outputs values between zero and one.

  • The forget gate decides what to erase from the cell state
  • The input gate decides what new information to write
  • The output gate decides what part of the cell state to expose as the hidden state

When the forget gate stays close to one, the cell state flows through almost unchanged, so gradients along it survive across many steps. This is why LSTMs can learn long-range dependencies that plain recurrent networks tend to miss.
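To make the gates concrete, here is a minimal sketch of one LSTM step in plain Python. It assumes a scalar cell (hidden size 1) for readability, and the parameter names (`wf`, `uf`, `bf`, and so on) are made up for this illustration. The candidate value `g` is the "new information" the input gate chooses how much of to write.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell (hidden size 1, illustration only)."""
    # Each gate is a small layer over the input and previous hidden state,
    # squashed into (0, 1) by a sigmoid.
    f = sigmoid(params["wf"] * x + params["uf"] * h_prev + params["bf"])  # forget gate
    i = sigmoid(params["wi"] * x + params["ui"] * h_prev + params["bi"])  # input gate
    o = sigmoid(params["wo"] * x + params["uo"] * h_prev + params["bo"])  # output gate
    # Candidate content that could be written, in (-1, 1).
    g = math.tanh(params["wg"] * x + params["ug"] * h_prev + params["bg"])
    # Erase part of the old memory, then add the gated new content.
    c = f * c_prev + i * g
    # Expose a gated view of the cell state as the hidden state.
    h = o * math.tanh(c)
    return h, c

# Hand-picked (not learned) parameters: forget gate near 1, input gate near 0,
# so the cell should keep its memory almost untouched.
params = {name: 0.0 for name in (
    "wf", "uf", "bf", "wi", "ui", "bi", "wo", "uo", "bo", "wg", "ug", "bg")}
params.update(bf=10.0, bi=-10.0, wg=1.0)

h, c = lstm_step(x=0.5, h_prev=0.0, c_prev=0.8, params=params)
print(h, c)
```

With these values the new cell state stays very close to the old value of 0.8. Flipping `bf` to a large negative number would instead wipe the memory.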

Why the design works

The key trick is an additive update path for the cell state, rather than repeated multiplication by a weight matrix followed by a squashing nonlinearity.

  • Adding new content to the gated previous state, instead of transforming the whole state at every step, mitigates the vanishing gradient problem
  • Gates let the cell keep a fact for a long time, then drop it when no longer needed
  • The same gated function, with the same weights, is applied at every step
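A rough numeric sketch shows why the additive path matters. It compares how a gradient shrinks over 50 steps in a plain scalar recurrence `h_t = tanh(w * h_{t-1})` and along the direct cell-state path of an LSTM. Both function names are hypothetical, and the LSTM side is simplified: the forget gate is held constant and the indirect paths through the gates are ignored.

```python
import math

def plain_rnn_gradient(w, h0, steps):
    """d h_T / d h_0 for the scalar recurrence h_t = tanh(w * h_{t-1})."""
    h, grad = h0, 1.0
    for _ in range(steps):
        h = math.tanh(w * h)
        # Chain rule: each step multiplies by w * tanh'(.), and tanh' = 1 - tanh^2.
        grad *= w * (1.0 - h * h)
    return grad

def cell_state_gradient(forget_gate, steps):
    """d c_T / d c_0 along c_t = f * c_{t-1} + (new content), with f held constant."""
    # The added content does not depend on c_{t-1} along this path,
    # so each step contributes only a factor of f.
    return forget_gate ** steps

print(plain_rnn_gradient(w=0.9, h0=0.5, steps=50))
print(cell_state_gradient(forget_gate=0.99, steps=50))
```

The plain recurrence loses almost all of its gradient, while the cell-state path keeps most of it as long as the forget gate stays near one. The forget gate is still a multiplication, so a gate held well below one would make the signal decay too; that is how the cell drops a fact it no longer needs.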

Key idea

An LSTM uses forget, input, and output gates around a protected cell state to hold memory across long sequences.

Check yourself

1. What does the forget gate control?

2. Why do LSTMs handle long-range dependencies better than plain recurrent networks?