Temperature, Top P, and Top K

Controlling randomness

When a model generates text, it predicts a probability distribution over possible next tokens. Sampling parameters decide how that distribution becomes a single choice. The three common knobs are temperature, top p, and top k.

What each knob does

Temperature rescales the distribution by dividing the model's raw scores (logits) before they are converted to probabilities. Low values sharpen it toward the top token, giving focused, nearly deterministic output. High values flatten it, adding variety and risk.
Top k keeps only the k most likely tokens, renormalizes their probabilities, and samples among them, cutting off the long tail.
Top p keeps the smallest set of tokens whose probabilities sum to at least p, a nucleus that shrinks when the model is confident and grows when it is not.
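
The three knobs can be sketched in a few lines of plain Python. This is a minimal illustration, not how any particular library implements sampling: the function names (softmax_with_temperature, top_k_filter, top_p_filter, sample_token) and the example logits are assumptions made up for this sketch.

```python
import math
import random

def softmax_with_temperature(logits, temperature=1.0):
    """Divide raw scores by the temperature, then convert to probabilities."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_filter(probs, k):
    """Keep the k most likely tokens, zero the rest, and renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = set(order[:k])
    filtered = [q if i in kept else 0.0 for i, q in enumerate(probs)]
    total = sum(filtered)
    return [q / total for q in filtered]

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = set(), 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    filtered = [q if i in kept else 0.0 for i, q in enumerate(probs)]
    total = sum(filtered)
    return [q / total for q in filtered]

def sample_token(probs):
    """Draw one token index according to the (filtered) probabilities."""
    return random.choices(range(len(probs)), weights=probs)[0]

# Hypothetical logits for a four-token vocabulary.
logits = [2.0, 1.0, 0.5, -1.0]
sharp = softmax_with_temperature(logits, 0.5)  # low temperature: mass moves to the top token
flat = softmax_with_temperature(logits, 2.0)   # high temperature: mass spreads out
```

A typical call chain would be softmax_with_temperature, then one of the filters, then sample_token. Real inference stacks usually work on logits rather than probabilities and may apply the filters in a different order.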

Choosing settings

For factual or structured tasks, use low temperature for consistency.
For creative tasks, raise temperature or top p for diversity.
Avoid stacking aggressive limits from several knobs at once, since they interact.
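
To see why the knobs interact, the sketch below counts how many tokens survive a fixed top p cutoff as temperature changes. The helper name nucleus_size and the logits are hypothetical, and the ordering (temperature first, then top p) is an assumption; libraries may order these steps differently.

```python
import math

def nucleus_size(logits, temperature, top_p):
    """Count the tokens kept by top p after temperature scaling."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = sorted((e / total for e in exps), reverse=True)
    cumulative = 0.0
    for count, q in enumerate(probs, start=1):
        cumulative += q
        if cumulative >= top_p:
            return count
    return len(probs)

# Hypothetical logits for a five-token vocabulary.
logits = [2.0, 1.5, 1.0, 0.5, 0.0]

for t in (0.2, 1.0, 2.0):
    print(f"temperature={t}: {nucleus_size(logits, t, top_p=0.9)} token(s) in the nucleus")
```

With these made-up logits, a temperature of 0.2 leaves a single token inside the 0.9 nucleus, so sampling becomes effectively greedy even though top p looks permissive on its own. That is the kind of compounding the advice above warns about.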

A practical note

These knobs change variety, not correctness. A confident model at low temperature can still be wrong, and high temperature does not add knowledge. Tune them to the task rather than chasing a single best setting.
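
A small sketch makes the point concrete. Temperature rescales scores but never reorders them, so if the model's top-ranked token is wrong, lowering the temperature only makes that wrong token more likely. The candidate answers and logits below are invented for illustration, and top_choice is a hypothetical helper.

```python
import math

def top_choice(logits, temperature):
    """Return (index, probability) of the most likely token at this temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best, probs[best]

# Made-up scores for "What is the capital of Australia?" where the
# model ranks a wrong answer first. These are not real model outputs.
candidates = ["Sydney", "Canberra", "Melbourne"]
logits = [3.0, 2.0, 1.0]

for t in (0.2, 1.0, 2.0):
    index, prob = top_choice(logits, t)
    print(f"temperature={t}: top answer {candidates[index]} with probability {prob:.3f}")
```

At every temperature the top answer stays the same; only its share of the probability mass changes.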

Key idea

Temperature, top p, and top k shape how the next-token distribution is sampled, trading focus for variety. They tune diversity, not correctness.
