Many paths, one answer
Self-consistency improves on chain-of-thought prompting by sampling several independent reasoning paths at a nonzero temperature, then taking the most common final answer. Different paths may reach the same conclusion by different routes, and the majority vote tends to be more reliable than any single sample.
Why voting helps
- Errors are diverse, so wrong paths often disagree with each other.
- Correct answers converge, since valid reasoning lands in the same place.
- Robustness rises because one unlucky sample no longer decides the result.
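A rough way to see the effect of voting is a simplified model. The sketch below assumes each sample is correct independently with probability p, and pessimistically assumes all wrong samples agree on a single wrong answer. Real samples from one model are correlated, so treat the numbers as illustrative only. The function name and the value p = 0.6 are chosen for this example.

```python
from math import comb

def majority_correct_prob(p: float, n: int) -> float:
    """Probability that a strict majority of n samples is correct.

    Assumption: each sample is correct with probability p, independently,
    and every wrong sample lands on the same wrong answer (worst case).
    """
    need = n // 2 + 1  # smallest count that forms a strict majority
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))

# With p = 0.6, one sample is right 60% of the time; five samples voting
# are right about 68% of the time under this model.
for n in (1, 5, 15):
    print(n, round(majority_correct_prob(0.6, n), 3))
```

When wrong paths scatter across several different answers, as the first bullet suggests, the correct answer can win with a plurality rather than a strict majority, so this model understates the benefit in that case.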
How to run it
- Generate several completions with temperature above zero for variety.
- Extract the final answer from each path.
- Return the answer that appears most often.
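The three steps above can be sketched as follows. This is a minimal illustration, not a reference implementation: `sample` stands in for whatever call produces one completion at a nonzero temperature, and the "Answer: ..." line format is an assumed prompt convention. Both are hypothetical and not part of any particular library.

```python
import re
from collections import Counter
from typing import Callable, Optional

def extract_answer(text: str) -> Optional[str]:
    """Return the last 'Answer: <value>' in a completion (assumed format), or None."""
    matches = re.findall(r"Answer:\s*(.+)", text)
    return matches[-1].strip() if matches else None

def self_consistency(sample: Callable[[str], str], prompt: str, n: int = 5) -> Optional[str]:
    """Sample n reasoning paths and return the most common final answer."""
    answers = []
    for _ in range(n):
        answer = extract_answer(sample(prompt))
        if answer is not None:  # skip paths with no parseable answer
            answers.append(answer)
    if not answers:
        return None
    # Counter breaks ties in favor of the answer seen first.
    return Counter(answers).most_common(1)[0][0]

# Stand-in sampler that replays canned completions instead of calling a model.
canned = iter([
    "3 + 4 = 7, times 2 is 14.\nAnswer: 14",
    "Double 3 is 6, double 4 is 8, total 14.\nAnswer: 14",
    "3 + 4 = 8, times 2 is 16.\nAnswer: 16",
])

def fake_sample(prompt: str) -> str:
    return next(canned)

print(self_consistency(fake_sample, "What is (3 + 4) * 2?", n=3))  # prints 14
```

Passing the sampler in as a function keeps the voting logic independent of any model client, so the same code can be exercised with canned text as shown here.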
Tradeoffs
With N samples this costs roughly N times the tokens of a single call, and extra latency unless the samples are generated in parallel, so it suits high-value questions rather than every request. It works best when the final answer is a discrete value you can tally, such as a number or a label, rather than free-form prose.
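Tallying only works if equivalent answers compare equal, so some normalization is usually needed before counting. The sketch below shows one possible approach for numeric answers. The helper name and the sample strings are made up for illustration, and the regular expression handles only simple cases (optional sign, thousands separators, decimal point).

```python
import re
from collections import Counter
from typing import Optional

def normalize_numeric(raw: str) -> Optional[str]:
    """Map strings such as '$1,200.00' and '1200' to one canonical form.

    Returns None when no number is found. Illustrative helper only.
    """
    match = re.search(r"-?\d[\d,]*(?:\.\d+)?", raw)
    if match is None:
        return None
    value = float(match.group().replace(",", ""))
    # Render whole numbers without a trailing '.0' so '1200.00' equals '1200'.
    return str(int(value)) if value.is_integer() else str(value)

raw_answers = ["$1,200.00", "1200", "1200 dollars", "1,250"]
votes = Counter(a for a in map(normalize_numeric, raw_answers) if a is not None)
print(votes.most_common(1)[0][0])  # prints 1200
```

Without the normalization step, the first three strings would count as three different answers and the vote would be a four-way tie.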
Key idea
Self-consistency samples multiple reasoning paths and returns the majority answer, raising reliability for discrete answers at the cost of more compute.