The Recurrent Network Recap

Processing sequences

A recurrent neural network (RNN) reads a sequence one element at a time, updating a hidden state that summarizes everything seen so far.

At each step the new hidden state depends on the current input and the previous hidden state.
The same weights are reused at every step, called parameter sharing across time.
The hidden state acts as the network memory.

Training

RNNs are trained with backpropagation through time, which unrolls the loop into a deep chain and propagates gradients backward across steps.

The trouble

Long chains cause vanishing or exploding gradients, so plain RNNs struggle to learn dependencies that span many steps. This limitation motivated gated cells like the LSTM.

Key idea

An RNN threads a single hidden state through a sequence with shared weights, giving it memory but also long range gradient problems.

The Recurrent Network Recap

Processing sequences

Training

The trouble

Key idea

Check yourself