Machine Learning

GRU Cells

A simpler gated recurrent cell with two gates and no separate cell state.


What it is

A gated recurrent unit (GRU) is a streamlined gated recurrent cell. It keeps much of the memory benefit of an LSTM but uses fewer parts, merging the cell state and hidden state into a single vector.

The two gates

A GRU uses two gates, where an LSTM uses three.

  • The update gate decides how much of the old hidden state to keep versus replace with new content
  • The reset gate decides how much of the past to ignore when forming the new candidate state

The update gate acts like a blend control. When it stays near its "keep" setting, the hidden state passes through almost unchanged, which preserves long-range memory.
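The two gates can be sketched as a single time step in NumPy. This is a minimal illustration, not a reference implementation: the names `gru_step` and `init_params`, the parameter dictionary, and the small random initialization are all assumptions made for the example. It also assumes the convention in which an update gate value near 1 means "keep the old state"; some papers and libraries flip that meaning.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, squashing values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def init_params(input_size, hidden_size, seed=0):
    """Hypothetical helper: small random weights and zero biases."""
    rng = np.random.default_rng(seed)
    params = {}
    for gate in ("z", "r", "h"):  # update gate, reset gate, candidate
        params["W" + gate] = 0.1 * rng.standard_normal((hidden_size, input_size))
        params["U" + gate] = 0.1 * rng.standard_normal((hidden_size, hidden_size))
        params["b" + gate] = np.zeros(hidden_size)
    return params


def gru_step(x, h_prev, p):
    """One GRU step. Assumed convention: z near 1 keeps the old state."""
    # Update gate: how much of the old hidden state to keep.
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h_prev + p["bz"])
    # Reset gate: how much of the past feeds the candidate state.
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h_prev + p["br"])
    # Candidate state, built from the input and the reset-scaled past.
    h_cand = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h_prev) + p["bh"])
    # Blend old state and candidate, elementwise.
    return z * h_prev + (1.0 - z) * h_cand


# Usage: push the update gate toward "keep" with a large positive bias.
p = init_params(input_size=3, hidden_size=4)
p["bz"] = np.full(4, 20.0)
h_prev = np.array([0.5, -0.2, 0.1, 0.9])
h_new = gru_step(np.array([1.0, 2.0, -1.0]), h_prev, p)
print(np.max(np.abs(h_new - h_prev)))  # very small: the state passes through
```

With the update gate saturated this way, the new hidden state is almost identical to the old one, which is the pass-through behavior described above.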

How it compares to LSTM

A GRU and an LSTM solve the same long-range memory problem with different trade-offs.

  • A GRU has fewer parameters, so it often trains somewhat faster and may need less data
  • An LSTM has a separate cell state and an extra gate, giving it more capacity
  • In practice the two often perform similarly, so the choice is empirical
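The parameter difference can be made concrete with a rough count. The sketch below assumes one input weight matrix, one recurrent weight matrix, and one bias vector per block, with three blocks for a GRU (two gates plus the candidate) and four for an LSTM (three gates plus the candidate). Real libraries may count slightly differently, for example by keeping two bias vectors per block. The function names are hypothetical.

```python
def block_param_count(input_size, hidden_size):
    """Parameters in one gate or candidate block: W, U, and a bias."""
    return hidden_size * (input_size + hidden_size) + hidden_size


def gru_param_count(input_size, hidden_size):
    # Update gate, reset gate, candidate state.
    return 3 * block_param_count(input_size, hidden_size)


def lstm_param_count(input_size, hidden_size):
    # Input gate, forget gate, output gate, candidate cell state.
    return 4 * block_param_count(input_size, hidden_size)


print(gru_param_count(10, 20))   # 1860
print(lstm_param_count(10, 20))  # 2480
```

Under these assumptions a GRU layer has three quarters of the parameters of an LSTM layer of the same size.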

Key idea

A GRU uses an update gate and a reset gate over a single state to keep long-range memory with fewer parameters than an LSTM.

Check yourself


1. How many gates does a GRU use?

2. What is one advantage of a GRU over an LSTM?