Machine Learning

The Instruction Following Eval

Checking that a model obeys explicit constraints, not just produces good content.

A different question

Quality evals ask whether the content is good. Instruction following evaluation asks a narrower question: did the model do exactly what it was told? Constraints such as "answer in three bullets", "avoid the letter e", or "reply only in JSON" can be checked independently of content quality.

Why verifiable constraints help

Many instruction constraints can be checked by a simple program, no judge required:

  • Format, such as valid JSON or a required template.
  • Length, a word or sentence count.
  • Inclusion or exclusion of specific words or sections.
  • Style, such as a required response language or all-lowercase text.

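Programmatic checks are cheap and objective. They also resist contamination better than reference-answer benchmarks, because they test the form of a fresh response rather than matching an answer the model may have memorized.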
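
As a rough sketch of what such checks can look like, the Python functions below cover two format constraints, one length constraint, and one exclusion constraint. The function names and the exact rules (what counts as a word or a bullet, for instance) are assumptions made for illustration, not a standard suite.

```python
import json
import re


def is_valid_json(response: str) -> bool:
    """Format check: the entire response must parse as JSON."""
    try:
        json.loads(response)
    except json.JSONDecodeError:
        return False
    return True


def has_word_count_at_most(response: str, limit: int) -> bool:
    """Length check: count whitespace-separated words (an assumed definition)."""
    return len(response.split()) <= limit


def has_exact_bullets(response: str, count: int) -> bool:
    """Format check: count lines that begin with a -, * or • marker."""
    bullets = [
        line for line in response.splitlines()
        if re.match(r"\s*[-*•]\s+", line)
    ]
    return len(bullets) == count


def avoids_letter(response: str, letter: str) -> bool:
    """Exclusion check: the letter must not appear in either case."""
    return letter.lower() not in response.lower()
```

Each function returns a plain pass or fail, so no judge model is involved. Note that a response such as `Sure! {"a": 1}` fails `is_valid_json`, which is the intended behavior for a "reply only in JSON" instruction.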

Composing constraints

Real prompts stack several constraints at once. Evals therefore report two numbers: the fraction of individual constraints satisfied, and the fraction of prompts on which every constraint held together. A model that satisfies each of four rules most of the time but never all four in the same response shows a real and common failure.
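
The gap between the two numbers can be sketched in Python. The function names and the sample results below are hypothetical; each inner list holds the pass or fail outcome of every constraint for one prompt.

```python
from typing import List


def constraint_level_accuracy(results: List[List[bool]]) -> float:
    """Fraction of individual constraints satisfied, pooled over all prompts."""
    flat = [passed for per_prompt in results for passed in per_prompt]
    return sum(flat) / len(flat)


def prompt_level_accuracy(results: List[List[bool]]) -> float:
    """Fraction of prompts on which every constraint held at once."""
    return sum(all(per_prompt) for per_prompt in results) / len(results)


# Hypothetical results: four prompts with four constraints each,
# where every response breaks exactly one constraint.
results = [
    [False, True, True, True],
    [True, False, True, True],
    [True, True, False, True],
    [True, True, True, False],
]

print(constraint_level_accuracy(results))  # 0.75
print(prompt_level_accuracy(results))      # 0.0
```

Here three quarters of the constraints are met, yet no single response obeys all four, which is why reporting only the pooled fraction can hide the failure.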

Pitfalls

A model can satisfy every constraint while giving a useless answer, so instruction following must be paired with quality scoring. Ambiguous instructions are unverifiable, so good suites use constraints with a single clear pass condition.
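
One way to surface the first pitfall is to store the constraint result next to a quality score and flag responses that pass the checks but score poorly. This is only a sketch: `EvalRecord`, `compliant_but_low_quality`, the 0 to 1 quality scale, and the 0.5 threshold are all assumptions, and the quality score is taken to come from a separate quality eval.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EvalRecord:
    prompt_id: str
    constraints_passed: bool  # outcome of the programmatic checks
    quality_score: float      # assumed 0-1 score from a separate quality eval


def compliant_but_low_quality(
    records: List[EvalRecord], threshold: float = 0.5
) -> List[EvalRecord]:
    """Return records that obey every constraint yet fall below the quality bar."""
    return [
        r for r in records
        if r.constraints_passed and r.quality_score < threshold
    ]


# Hypothetical records for illustration only.
records = [
    EvalRecord("p1", constraints_passed=True, quality_score=0.9),
    EvalRecord("p2", constraints_passed=True, quality_score=0.1),
    EvalRecord("p3", constraints_passed=False, quality_score=0.8),
]

flagged = compliant_but_low_quality(records)
print([r.prompt_id for r in flagged])  # ['p2']
```

Keeping the two signals separate, rather than folding them into one number, makes it visible when obedience and usefulness diverge.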

Key idea

Instruction following evaluation uses programmatic checks of verifiable constraints to measure obedience precisely, but it must be paired with quality scoring since a model can obey every rule yet answer uselessly.

Check yourself

1. Why are verifiable constraints attractive for instruction following evals?

2. Why must instruction following be paired with quality scoring?