Machine Learning

Self-Consistency Decoding

Sampling many reasoning paths and voting on the answer.

Many paths, one answer

Self-consistency improves chain-of-thought prompting by sampling several reasoning paths independently at a nonzero temperature, then taking the most common final answer. Different paths may reach the same conclusion through different routes, and the majority vote tends to be more reliable than any single sample.

Why voting helps

  • Errors are diverse, so wrong paths often disagree with each other.
  • Correct answers converge, since valid reasoning lands in the same place.
  • Robustness rises because one unlucky sample no longer decides the result.
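
A toy calculation gives a rough feel for the effect. The sketch below assumes each sample is correct with probability p, independently of the others, and that every wrong sample lands on the same wrong answer. Both are simplifying assumptions, and the function name is made up for illustration. Samples from one model are correlated in practice, so treat the numbers as intuition, not a prediction.

```python
from math import comb


def majority_correct_prob(p: float, n: int) -> float:
    """Chance that more than half of n samples are correct.

    Assumptions (not from the lesson): samples are independent, each is
    correct with probability p, and all wrong samples share one answer.
    """
    if n % 2 == 0:
        raise ValueError("use an odd n so the vote cannot tie")
    needed = n // 2 + 1  # smallest number of correct samples that wins
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(needed, n + 1)
    )


if __name__ == "__main__":
    for n in (1, 5, 15):
        print(n, round(majority_correct_prob(0.6, n), 3))
```

Under these assumptions the vote only helps when p is above 0.5. When wrong paths scatter across different answers, as the first bullet describes, the correct answer needs only a plurality, so voting can do better than this estimate.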

How to run it

  • Generate several completions with temperature above zero for variety.
  • Extract the final answer from each path.
  • Return the answer that appears most often.
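
The three steps above can be sketched as follows. This is a minimal illustration, not a reference implementation: `sample_fn` is a hypothetical callable standing in for whatever model client you use, and the "Answer: ..." line is an assumed output convention that your prompt would need to request.

```python
import re
from collections import Counter
from typing import Callable, Optional

# Assumed convention: each reasoning path ends with a line "Answer: <value>".
ANSWER_RE = re.compile(r"Answer:\s*(.+)", re.IGNORECASE)


def extract_answer(completion: str) -> Optional[str]:
    """Return the last 'Answer: ...' value in a completion, or None."""
    matches = ANSWER_RE.findall(completion)
    return matches[-1].strip() if matches else None


def self_consistency(
    prompt: str,
    sample_fn: Callable[[str, float], str],  # hypothetical: (prompt, temperature) -> text
    n: int = 5,
    temperature: float = 0.7,
) -> Optional[str]:
    """Sample n reasoning paths and return the most common final answer."""
    answers = []
    for _ in range(n):
        completion = sample_fn(prompt, temperature)
        answer = extract_answer(completion)
        if answer is not None:  # skip paths with no parseable answer
            answers.append(answer)
    if not answers:
        return None
    return Counter(answers).most_common(1)[0][0]
```

Paths with no parseable answer are dropped here instead of counted as a vote. That is one possible choice; counting them as abstentions and requiring a minimum number of valid votes is another.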

Tradeoffs

With N samples this costs roughly N times the output tokens of a single call, and latency grows too unless the samples are requested in parallel, so it suits high-value questions rather than every request. It works best when the final answer is a discrete value you can tally, such as a number or a label, rather than free-form prose.
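
Tallying only works if equal answers compare equal, so it may help to normalize each answer before counting. The sketch below is one possible approach; the `normalize` helper and its rules (strip whitespace, drop thousands separators and a trailing period, lowercase labels) are assumptions for illustration and would need adjusting for your answer format.

```python
from collections import Counter


def normalize(answer: str) -> str:
    """Map equivalent spellings of an answer to one canonical string."""
    text = answer.strip().rstrip(".").replace(",", "").lower()
    try:
        value = float(text)
    except ValueError:
        return text  # not a number: treat as a label
    # Render whole numbers without a decimal part so "1200.0" matches "1200".
    return str(int(value)) if value.is_integer() else str(value)


raw_answers = ["1,200", "1200.0", " 1200", "1250"]
votes = Counter(normalize(a) for a in raw_answers)
winner, count = votes.most_common(1)[0]
print(winner, count)
```

Without normalization the four raw strings above would each get one vote. Free-form prose has no such canonical form, which is why it is hard to tally.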

Key idea

Self-consistency samples multiple reasoning paths and returns the majority answer, raising reliability for discrete answers at the cost of more compute.

Check yourself

1. How does self-consistency decoding choose its answer?

2. What kind of answer does self-consistency work best with?

3. What is the main cost of self-consistency?