The Self Consistency Deep

One chain can be unlucky

A single chain of thought may take a wrong turn. Self consistency samples several independent reasoning paths for the same question and then picks the answer most of them agree on. It trades extra compute for robustness.

The recipe

Use a chain of thought prompt and a nonzero temperature so paths differ.
Sample several completions, each with its own reasoning.
Extract the final answer from each path.
Take the majority vote over those answers.

Why voting helps

Correct reasoning tends to converge on the same answer through different routes, while errors scatter. The mode of many samples is therefore more often right than any single greedy decode, especially on problems with one clean final value.

When it fits

It shines on tasks with a short discrete answer that is easy to compare, such as a number or a label. For open ended writing there is no clean vote, so the method does not apply as cleanly and costs add up fast.

Key idea

Self consistency samples several reasoning paths and takes the majority answer, since correct chains converge while errors scatter, buying reliability on discrete answer tasks at the cost of extra compute.

The Self Consistency Deep

One chain can be unlucky

The recipe

Why voting helps

When it fits

Key idea

Check yourself