Heads find roles
When you inspect a trained transformer, individual attention heads often take on interpretable roles. Nobody assigns these jobs; the heads discover them through training.
Recurring head types
- Previous-token heads attend mostly to the immediately preceding token.
- Positional heads attend to a fixed relative offset.
- Syntactic heads link words by grammatical relations, such as a verb to its subject.
- Induction heads continue a repeated pattern by finding a prior occurrence of the current token and predicting what followed it.
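One rough way to spot the first two head types is to measure how much attention each position places on the token a fixed number of steps back. The sketch below assumes you already have one head's attention pattern as a square list of rows (row i is the attention distribution of query position i). The names offset_score and make_offset_head are hypothetical helpers made up for illustration, and the matrix is synthetic, not taken from a real model.

```python
def offset_score(attn, offset=1):
    """Mean attention weight that query position i places on key position i - offset.

    attn is assumed to be a square list of lists, one row per query position.
    A score near 1.0 with offset=1 suggests a previous-token head.
    """
    rows = range(offset, len(attn))  # only positions that have a token `offset` steps back
    if len(rows) == 0:
        return 0.0
    return sum(attn[i][i - offset] for i in rows) / len(rows)


def make_offset_head(n, offset):
    """Build a synthetic, idealized attention pattern (illustration only)."""
    attn = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Early positions with no token `offset` steps back attend to themselves.
        j = i - offset if i >= offset else i
        attn[i][j] = 1.0
    return attn


prev_token_head = make_offset_head(5, offset=1)
score_at_1 = offset_score(prev_token_head, offset=1)  # all mass on the previous token
score_at_2 = offset_score(prev_token_head, offset=2)  # no mass two tokens back
```

Real heads are rarely this clean, so in practice a score like this would be averaged over many inputs and compared against a threshold you choose.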
Why induction heads matter
Induction heads are a well-studied mechanism behind in-context learning. They scan back for an earlier occurrence of the current token, look at what came after it last time, and bias the model toward repeating that continuation. The behavior emerges during training, in models with at least two attention layers, and helps explain some few-shot ability.
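The rule an induction head approximates can be written as a plain lookup. This is a minimal sketch of the behavior, not of the attention computation itself: a real head expresses it as a soft bias on the output distribution rather than a hard lookup. The function name induction_predict is hypothetical.

```python
def induction_predict(tokens):
    """Return the token that followed the most recent earlier occurrence
    of the last token, or None if the last token has not appeared before."""
    if not tokens:
        return None
    current = tokens[-1]
    # Scan backwards over earlier positions, skipping the final one.
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == current:
            return tokens[i + 1]  # what came after it last time
    return None


context = ["The", "cat", "sat", ".", "The"]
guess = induction_predict(context)  # "cat"
```

The lookup depends only on the context, not on anything memorized about specific tokens, which is why this mechanism is associated with in-context learning.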
The bigger picture
Not every head has a crisp role, and many can be pruned with little loss, which suggests redundancy. Still, the existence of clean, reusable head roles is a window into how transformers organize computation and a foundation of mechanistic interpretability.
Key idea
Attention heads spontaneously specialize into roles such as previous-token, positional, syntactic, and induction heads. Induction heads in particular underpin in-context learning, giving interpretability a foothold in transformer computation.