← Lessons

quiz vs the machine

Gold1400

Machine Learning

The Encoder Decoder

One network reads the input, another writes the output sequence.

4 min read · core · beat Gold to climb

Two stages

An encoder decoder architecture maps a variable length input to a variable length output using two cooperating networks.

  • The encoder reads the full input and compresses it into a representation.
  • The decoder generates the output one token at a time from that representation.

This pattern powers machine translation, summarization, and speech to text.

The bottleneck

In early sequence to sequence models the encoder squeezed everything into a single fixed length vector. Long inputs overflowed this bottleneck, hurting quality. Attention later let the decoder look back at all encoder states.

The decoder is usually autoregressive: each generated token feeds back as input for the next step.

Key idea

Encoder decoder splits sequence transduction into reading and writing, where a fixed bottleneck once limited quality until attention removed it.

Check yourself

Answer to earn rating on the learn ladder.

1. What is the role of the decoder?

2. What weakness did the fixed length context vector cause?