The Supervised Fine Tuning

From text predictor to assistant

A base model continues text but does not reliably follow instructions. Supervised fine tuning (SFT) fixes this by training on curated examples of prompts paired with high quality responses.

The data

Human writers or vetted sources produce demonstrations: an instruction and an ideal answer.
Examples cover question answering, summarizing, coding, refusing unsafe requests, and more.
The same next token loss is used, but now over these formatted dialogues.

Why it shifts behavior

The model learns the format of helpful turn taking, including system and user roles.
It learns to produce answers rather than merely continue the prompt.
Refusal demonstrations teach it to decline some categories of requests.

Limits of SFT alone

SFT only imitates the demonstrations it sees, so coverage gaps remain.
It cannot easily express that one answer is better than another by degree, only that the demonstrated answer is correct.
This is why preference based methods usually follow SFT.

Key idea

Supervised fine tuning trains the base model on curated instruction and response pairs so it follows instructions and adopts helpful, safe formats, but it only imitates the demonstrations provided.

The Supervised Fine Tuning

From text predictor to assistant

The data

Why it shifts behavior

Limits of SFT alone

Key idea

Check yourself