Machine Learning

Instruction Tuning

Teaching a base model to follow natural language instructions.

4 min read · intro

From predictor to assistant

A base language model is trained only to predict the next token. It can continue text, but it will not reliably follow an instruction such as "summarize this" or "answer this question"; it may just as easily continue the instruction as if it were ordinary text. Instruction tuning fine-tunes the model on many examples of instructions paired with good responses.

The dataset shape

Each example contains an instruction, an optional input, and a target response.

  • Examples cover diverse tasks such as translation, summarization, and reasoning.
  • The model learns the general skill of mapping a request to a helpful answer.
  • Variety matters more than depth on any single task, because it is what lets the behavior generalize.
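
As a rough illustration of the dataset shape described above, here is a minimal sketch in Python. The `Example` class, the `format_prompt` helper, and the `### Instruction:` / `### Input:` / `### Response:` section markers are assumptions made for this example, loosely modelled on commonly used templates; the lesson itself does not prescribe a format.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One instruction-tuning record (hypothetical field names)."""
    instruction: str
    response: str
    input: str = ""  # optional extra context; empty when not needed


def format_prompt(ex: Example) -> str:
    """Render the text the model sees before the target response.

    The section markers are an assumed template, not a fixed standard.
    """
    parts = [f"### Instruction:\n{ex.instruction}"]
    if ex.input:  # include the input section only when it is present
        parts.append(f"### Input:\n{ex.input}")
    parts.append("### Response:\n")
    return "\n\n".join(parts)


examples = [
    Example("Translate to French.", "Bonjour", input="Hello"),
    Example("Name the largest planet in the solar system.", "Jupiter"),
]

for ex in examples:
    # Training text = formatted prompt followed by the target response.
    print(format_prompt(ex) + ex.response)
    print("-" * 20)
```

Keeping the optional input in its own section lets the same instruction be reused with many different inputs.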

The training flow
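
One common way to run this step is supervised fine-tuning: concatenate the prompt and the target response, then compute the language-modeling loss only on the response tokens. The sketch below assumes that setup; the lesson does not say whether prompt tokens are masked, so treat the masking as one possible choice. The names `build_training_pair` and `masked_nll`, the token ids, and the probabilities are all hypothetical, and the usual one-position shift between inputs and labels is omitted for brevity.

```python
import math

# Sentinel marking positions excluded from the loss (an assumed convention;
# -100 matches the default ignore_index of PyTorch's CrossEntropyLoss).
IGNORE = -100


def build_training_pair(prompt_ids, response_ids):
    """Concatenate prompt and response; mask the prompt in the labels."""
    input_ids = prompt_ids + response_ids
    labels = [IGNORE] * len(prompt_ids) + response_ids
    return input_ids, labels


def masked_nll(token_probs, labels):
    """Mean negative log-likelihood over unmasked positions.

    token_probs[i] stands in for the probability a model assigned to the
    correct token at position i, given the tokens before it.
    """
    losses = [-math.log(p) for p, y in zip(token_probs, labels) if y != IGNORE]
    return sum(losses) / len(losses)


# Toy token ids: three prompt tokens followed by two response tokens.
input_ids, labels = build_training_pair([11, 12, 13], [21, 22])

# Made-up probabilities for each position; only the last two count.
probs = [0.9, 0.9, 0.9, 0.5, 0.25]
loss = masked_nll(probs, labels)
print(labels, round(loss, 4))
```

In a real run, an optimizer would update the model weights to lower this loss across many batches of examples.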

Why it generalizes

Because the data spans many tasks phrased as instructions, the model learns the instruction format and how to infer the intent behind a request, rather than memorizing answers. It can then follow instructions for tasks it never saw during tuning, a key step toward a usable assistant.
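
One way to check this kind of generalization is to hold out entire tasks during tuning and evaluate only on those. The sketch below assumes each example carries a `task` label, which is a hypothetical field added for illustration; `split_by_task` is likewise a made-up helper.

```python
def split_by_task(examples, held_out_tasks):
    """Split so that held-out tasks never appear in the training set."""
    train = [ex for ex in examples if ex["task"] not in held_out_tasks]
    test = [ex for ex in examples if ex["task"] in held_out_tasks]
    return train, test


# Toy data; the "task" field is an assumption for this example.
examples = [
    {"task": "translation", "instruction": "Translate to French: Hello"},
    {"task": "summarization", "instruction": "Summarize this paragraph."},
    {"task": "reasoning", "instruction": "Which is larger, 7 or 9?"},
    {"task": "translation", "instruction": "Translate to Spanish: Thank you"},
]

train, test = split_by_task(examples, held_out_tasks={"reasoning"})
print(len(train), "training examples;", len(test), "held-out example(s)")
```

Splitting by task rather than by individual example matters here: a random split would leak every task into training and would only measure performance on seen tasks.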

Key idea

Instruction tuning fine-tunes a base model on diverse instruction-response pairs so that it learns to follow natural language requests and generalizes to new tasks.

Check yourself

1. What does instruction tuning teach a model to do?

2. Why is task diversity important in instruction tuning?