Recurrent Neural Networks

What it is

A recurrent neural network processes a sequence one element at a time while carrying a hidden state forward. The hidden state acts as a memory: a fixed-size vector that holds a compressed summary of what the network has seen so far.

How a step works

At each step the network takes the current input and the previous hidden state and produces a new hidden state.

The new state mixes the fresh input with the running memory
The same weights are applied at every step, which is weight sharing across time
An output can be read from the hidden state at any step

Because the same function repeats, a recurrent network can handle sequences of any length with a fixed set of weights.
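The step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the tanh activation, the zero initial state, and the names rnn_step, run_rnn, W_xh, W_hh, and b are assumptions chosen for the example, and the weights are random rather than trained.

```python
import math
import random

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rnn_step(x, h_prev, W_xh, W_hh, b):
    """One recurrent step: h_new = tanh(W_xh x + W_hh h_prev + b)."""
    from_input = matvec(W_xh, x)        # contribution of the fresh input
    from_memory = matvec(W_hh, h_prev)  # contribution of the running memory
    return [math.tanh(i + m + bi) for i, m, bi in zip(from_input, from_memory, b)]

def run_rnn(xs, W_xh, W_hh, b):
    """Apply the same step, with the same weights, to every element of xs."""
    h = [0.0] * len(b)  # initial hidden state (assumed to be zeros)
    states = []
    for x in xs:
        h = rnn_step(x, h, W_xh, W_hh, b)
        states.append(h)  # an output could be read from h at any step
    return states

def random_matrix(rows, cols, rng, scale=0.5):
    """Hypothetical helper: untrained weights drawn uniformly from [-scale, scale]."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
input_size, hidden_size = 3, 4
W_xh = random_matrix(hidden_size, input_size, rng)
W_hh = random_matrix(hidden_size, hidden_size, rng)
b = [0.0] * hidden_size

# Two sequences of different lengths, processed with the same weights.
short_seq = [[rng.uniform(-1, 1) for _ in range(input_size)] for _ in range(2)]
long_seq = [[rng.uniform(-1, 1) for _ in range(input_size)] for _ in range(50)]

short_states = run_rnn(short_seq, W_xh, W_hh, b)
long_states = run_rnn(long_seq, W_xh, W_hh, b)
print(len(short_states), len(long_states))  # prints "2 50": one state per element
```

Nothing in run_rnn depends on the sequence length, so the same three parameter arrays serve the 2-element and the 50-element sequence alike.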

Training and its limits

Recurrent networks train with backpropagation through time, which unrolls the loop into a deep chain and propagates gradients backward.

Recurrent networks trained this way work well on sequential data such as language, audio, and time series
But long chains suffer from vanishing gradients (and, less often, exploding ones), so plain recurrent networks struggle to learn long-range dependencies

This weakness is exactly what gated cells like the LSTM were designed to address.
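The vanishing effect can be seen numerically by pushing a gradient backward through an unrolled tanh recurrence and watching its norm. The sketch below is only an illustration under stated assumptions: the recurrence has no inputs (h_t = tanh(W_hh h_{t-1})), the weights are untrained and deliberately small, and the names W_hh, states, and grad_norms are made up for the example. With larger recurrent weights the gradient may grow instead of shrink.

```python
import math
import random

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))

rng = random.Random(0)
hidden_size, steps = 4, 50

# Deliberately small recurrent weights (an assumption for this illustration).
W_hh = [[rng.uniform(-0.2, 0.2) for _ in range(hidden_size)]
        for _ in range(hidden_size)]
W_hh_T = [list(col) for col in zip(*W_hh)]  # transpose, used in the backward pass

# Forward pass: unroll the loop and store every hidden state.
h = [rng.uniform(-1, 1) for _ in range(hidden_size)]
states = []
for _ in range(steps):
    h = [math.tanh(a) for a in matvec(W_hh, h)]
    states.append(h)

# Backward pass: start with a gradient of ones at the last step and apply
# grad_{t-1} = W_hh^T (diag(1 - h_t^2) grad_t) once per unrolled step.
grad = [1.0] * hidden_size
grad_norms = [norm(grad)]
for h_t in reversed(states):
    through_tanh = [(1.0 - hi * hi) * gi for hi, gi in zip(h_t, grad)]
    grad = matvec(W_hh_T, through_tanh)
    grad_norms.append(norm(grad))

print(grad_norms[0], grad_norms[10], grad_norms[-1])
```

Each backward step multiplies the gradient by the recurrent weights and by a tanh derivative that is at most 1, so with weights this small the norm shrinks at every step and the earliest steps receive almost no learning signal.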

Key idea

A recurrent network carries a hidden state across a sequence, reusing one set of weights at every step.
