The unifying idea
T5 casts every task as text-to-text: the input is a string and the output is a string, whether the task is translation, summarization, or classification.
- It uses the full encoder-decoder Transformer.
- The encoder reads the input; the decoder generates the answer as text.
- Task prefixes in the input tell the model what to do.
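As a rough illustration of the format, the sketch below builds input/target string pairs for three tasks. The `TextToTextExample` class and `make_example` helper are hypothetical names introduced here, and the prefixes and sentences are only illustrative; the point is that classification labels are emitted as ordinary words, just like a translation or a summary.

```python
from dataclasses import dataclass


@dataclass
class TextToTextExample:
    source: str  # string fed to the encoder
    target: str  # string the decoder is trained to produce


def make_example(prefix: str, text: str, answer: str) -> TextToTextExample:
    """Prepend a task prefix so a single model can tell tasks apart."""
    return TextToTextExample(source=f"{prefix}: {text}", target=answer)


# Illustrative examples only; the sentences are made up for this sketch.
examples = [
    make_example("translate English to German", "That is good.", "Das ist gut."),
    make_example("summarize", "A long article about storms ...", "Storms hit the coast."),
    # Classification: the label is produced as plain text, not a class index.
    make_example("cola sentence", "The course is jumping well.", "not acceptable"),
]

for ex in examples:
    print(f"{ex.source!r} -> {ex.target!r}")
```

Because every example has the same shape (string in, string out), adding a new task in this sketch means choosing a new prefix, not changing the model or the output layer.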
Pretraining
T5 is pretrained with a span corruption objective. Contiguous spans of input tokens are each replaced by a single sentinel token, and the model learns to generate the missing spans, delimited by the same sentinels, as the target sequence.
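A minimal sketch of the objective on a toy sentence follows. The `span_corrupt` function is a hypothetical helper written for this note, and the span positions are fixed by hand for clarity; actual pretraining samples spans at random. Sentinel names of the form `<extra_id_N>` follow the convention of the released T5 vocabularies, which is assumed here.

```python
def span_corrupt(tokens, spans):
    """Replace each (start, length) span with one sentinel token.

    `spans` must be sorted and non-overlapping. Returns (inputs, targets):
    inputs keep the uncorrupted tokens with a sentinel per dropped span;
    targets list each sentinel followed by the tokens it replaced,
    and end with one final sentinel.
    """
    inputs, targets = [], []
    pos = 0
    for i, (start, length) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"
        inputs.extend(tokens[pos:start])   # keep tokens before the span
        inputs.append(sentinel)            # one sentinel stands in for the span
        targets.append(sentinel)
        targets.extend(tokens[start:start + length])  # the dropped tokens
        pos = start + length
    inputs.extend(tokens[pos:])            # keep the tail after the last span
    targets.append(f"<extra_id_{len(spans)}>")  # closing sentinel
    return inputs, targets


tokens = "Thank you for inviting me to your party last week .".split()
# Hand-picked spans: "for inviting" (index 2, length 2) and "last" (index 8, length 1).
inputs, targets = span_corrupt(tokens, [(2, 2), (8, 1)])

print(" ".join(inputs))   # Thank you <extra_id_0> me to your party <extra_id_1> week .
print(" ".join(targets))  # <extra_id_0> for inviting <extra_id_1> last <extra_id_2>
```

Note that the target contains only the dropped spans rather than the full sentence, so target sequences stay short relative to the input.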
Why it matters
A single uniform format means one model, loss, and decoding procedure across many tasks. This simplifies multitask training and transfer learning.
Because it keeps both encoder and decoder, T5 combines strong input understanding with flexible generation.
Key idea
T5 unifies NLP as text-to-text over an encoder-decoder, pretrained by reconstructing corrupted spans, so one model handles many tasks with one format.