← Lessons

quiz vs the machine

Gold1420

Machine Learning

The Self Consistency Deep

Sampling many reasoning paths and voting to get a more reliable answer.

5 min read · core · beat Gold to climb

One chain can be unlucky

A single chain of thought may take a wrong turn. Self consistency samples several independent reasoning paths for the same question and then picks the answer most of them agree on. It trades extra compute for robustness.

The recipe

  • Use a chain of thought prompt and a nonzero temperature so paths differ.
  • Sample several completions, each with its own reasoning.
  • Extract the final answer from each path.
  • Take the majority vote over those answers.

Why voting helps

Correct reasoning tends to converge on the same answer through different routes, while errors scatter. The mode of many samples is therefore more often right than any single greedy decode, especially on problems with one clean final value.

When it fits

It shines on tasks with a short discrete answer that is easy to compare, such as a number or a label. For open ended writing there is no clean vote, so the method does not apply as cleanly and costs add up fast.

Key idea

Self consistency samples several reasoning paths and takes the majority answer, since correct chains converge while errors scatter, buying reliability on discrete answer tasks at the cost of extra compute.

Check yourself

Answer to earn rating on the learn ladder.

1. How does self consistency pick its final answer?

2. Why must the temperature be nonzero?

3. Where does self consistency fit best?