The Machine Translation Deep

The task

Machine translation converts text from a source language to a target language while preserving meaning. Modern systems are neural networks trained end-to-end on parallel sentence pairs.

Encoder-decoder with attention

The encoder reads the source sentence into contextual vectors.
The decoder generates the target one token at a time.
Attention lets each decoding step focus on the relevant source words, which removed the bottleneck of squeezing a long sentence into a single fixed-size vector.
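
To make the attention step concrete, here is a minimal sketch in plain Python. It assumes unscaled dot-product scoring, which is only one of several scoring functions in use, and the names `attend`, `encoder_states`, and `query` are hypothetical, chosen for illustration. The decoder's current state is compared with every encoder vector, the scores become weights through a softmax, and the context is the weighted average of the encoder vectors.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, encoder_states):
    """Dot-product attention for one decoding step (illustrative sketch).

    query:          decoder state, a list of floats
    encoder_states: one vector per source token, same dimension as query
    """
    # Score each source position by its dot product with the query.
    scores = [sum(q * h for q, h in zip(query, state)) for state in encoder_states]
    weights = softmax(scores)
    # The context vector is the attention-weighted average of encoder states.
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy values, not taken from any real model.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.0, 2.0]
weights, context = attend(query, encoder_states)
```

With these toy values the second and third source positions receive equal weight and the first receives much less, because its vector has no overlap with the query.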

Transformers dropped recurrence entirely, making the architecture fully attention-based and highly parallel during training.

Decoding the output

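Greedy decoding takes the single most likely token at each step, which can go astray because an early, locally best choice cannot be revised later.
Beam search keeps the k highest-scoring partial translations at each step and extends each of them, trading extra compute for better output.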
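
The difference between the two strategies can be sketched on a toy example. Everything here is an assumption made for illustration: `TOY_MODEL` is a hand-written lookup table standing in for a trained decoder, `</s>` marks the end of a sentence, and the beam search omits length normalization, which practical systems often add.

```python
import math

EOS = "</s>"

# Hypothetical next-token probabilities, keyed by the prefix generated so far.
TOY_MODEL = {
    (): {"the": 0.6, "a": 0.4},
    ("the",): {"cat": 0.4, "dog": 0.3, EOS: 0.3},
    ("a",): {"cat": 0.9, EOS: 0.1},
    ("the", "cat"): {EOS: 1.0},
    ("the", "dog"): {EOS: 1.0},
    ("a", "cat"): {EOS: 1.0},
}


def next_token_probs(prefix):
    return TOY_MODEL[tuple(prefix)]


def greedy_decode(max_len=5):
    """Pick the most likely token at every step; never reconsider."""
    prefix, logp = [], 0.0
    for _ in range(max_len):
        probs = next_token_probs(prefix)
        token = max(probs, key=probs.get)
        prefix.append(token)
        logp += math.log(probs[token])
        if token == EOS:
            break
    return prefix, logp


def beam_search(beam_size=2, max_len=5):
    """Keep the beam_size best partial hypotheses at every step."""
    beams = [([], 0.0)]  # (tokens, log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, logp in beams:
            for token, p in next_token_probs(prefix).items():
                candidates.append((prefix + [token], logp + math.log(p)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for prefix, logp in candidates[:beam_size]:
            # Hypotheses that emitted EOS are complete and leave the beam.
            if prefix[-1] == EOS:
                finished.append((prefix, logp))
            else:
                beams.append((prefix, logp))
        if not beams:
            break
    return max(finished + beams, key=lambda c: c[1])


greedy_tokens, greedy_logp = greedy_decode()
beam_tokens, beam_logp = beam_search(beam_size=2)
```

In this table greedy decoding commits to "the" because it is the most likely first token, and ends with a sequence of probability 0.24. A beam of size 2 also keeps "a" alive and finds "a cat", whose total probability is 0.36.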

Evaluation and hard cases

BLEU scores n-gram overlap with human reference translations and is the long-standing benchmark metric.
Low-resource language pairs lack parallel data, so translation quality drops sharply.
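
As a rough illustration of what n-gram overlap means, the sketch below computes a simplified sentence-level score against a single reference. It is an assumption-laden toy, not the standard metric: standard BLEU is computed over a whole corpus, typically with n-grams up to length 4 and possibly several references, and the function names here are hypothetical. The sketch keeps the two core ingredients, clipped n-gram precision and a brevity penalty for short candidates.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate, reference, n):
    """Fraction of candidate n-grams found in the reference.

    Each n-gram is credited at most as often as it occurs in the reference,
    so repeating a correct word does not inflate the score.
    """
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    total = sum(cand.values())
    return overlap / total if total else 0.0


def simple_bleu(candidate, reference, max_n=2):
    """Geometric mean of clipped precisions times a brevity penalty."""
    precisions = [clipped_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Penalize candidates that are shorter than the reference.
    if len(candidate) >= len(reference):
        brevity = 1.0
    else:
        brevity = math.exp(1 - len(reference) / len(candidate))
    return brevity * math.exp(log_mean)


reference = "the cat sat on the mat".split()
candidate = "the cat sat on a mat".split()
score = simple_bleu(candidate, reference)
```

Here 5 of 6 candidate unigrams and 3 of 5 candidate bigrams appear in the reference, so the score is the square root of 5/6 times 3/5, about 0.71.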
Idioms and word order differences between languages remain hard.

Key idea

Neural machine translation uses an attention-based encoder-decoder trained on parallel data, decodes with beam search, and is scored with BLEU, while low-resource pairs and idioms stay difficult.
