What a hallucination is
A hallucination is a fluent, confident output that is factually wrong or unsupported. The model is not lying. It is sampling a plausible continuation that happens to be false.
Why they happen
- The training objective rewards plausibility, not truth. A fluent guess scores like a fact.
- Knowledge is stored lossily in weights, so rare or recent facts are blurry.
- The model has no built-in uncertainty signal that compels it to say "I do not know."
- Pressure to be helpful and to always give an answer can push it to fabricate rather than abstain.
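The first point in the list above can be made concrete with a toy calculation. This is a minimal sketch, not any particular model's training code: the function name `cross_entropy` and the probability values are assumptions chosen for illustration. It shows that a next-token loss depends only on the probability assigned to the observed token, so nothing in it can tell a true continuation from a false one.

```python
import math

def cross_entropy(prob_of_target: float) -> float:
    """Next-token loss for one position.

    It depends only on the probability the model gave to the observed
    token. There is no input that says whether the token is true.
    """
    return -math.log(prob_of_target)

# Hypothetical probabilities (assumed for illustration) that a model assigns
# to the last token of two sentences: one factually correct, one plausible
# but wrong. The loss cannot tell them apart.
loss_for_true_sentence = cross_entropy(0.30)
loss_for_false_sentence = cross_entropy(0.30)

print(loss_for_true_sentence == loss_for_false_sentence)
```

If both sentences appear in training text with similar frequency, the objective pushes the model toward both equally.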
Where they cluster
- Specific names, dates, numbers, citations, and quotes are high risk because exact recall is hard.
- Long-tail entities and events after the training cutoff are common failure spots.
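One way to act on the list above is to flag high-risk spans in a model's output so they can be checked against a source. The sketch below is a rough heuristic under stated assumptions: the names `RISK_PATTERNS` and `flag_risky_spans` are hypothetical, and the regular expressions are illustrative and far from exhaustive. They will miss many cases and over-flag others, and they do not attempt to detect names.

```python
import re

# Hypothetical, deliberately simple patterns for span types that tend to
# require exact recall. These are assumptions for illustration only.
RISK_PATTERNS = {
    "year": re.compile(r"\b(?:1[5-9]\d{2}|20\d{2})\b"),
    "number": re.compile(r"\b\d+(?:\.\d+)?%?"),
    "quote": re.compile(r'"[^"]+"'),
    "citation": re.compile(r"\([A-Z][A-Za-z]+(?: et al\.)?,? \d{4}\)"),
}

def flag_risky_spans(text):
    """Return (label, matched_text) pairs for spans worth verifying."""
    flagged = []
    for label, pattern in RISK_PATTERNS.items():
        for match in pattern.finditer(text):
            flagged.append((label, match.group(0)))
    return flagged

sample = 'The method (Jones et al., 2019) was called "robust" in 42% of reviews.'
for label, span in flag_risky_spans(sample):
    print(label, span)
```

A span can match more than one pattern (a year is also a number). That is acceptable here, because the goal is only to decide what to verify.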
Reducing them
- Grounding the model in retrieved sources cuts reliance on fuzzy memory.
- Calibration and abstention training teach it to defer when unsure.
- Decoding choices and prompts that explicitly allow "I do not know" as an answer also help.
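Two of the mitigations above can be sketched in a few lines. This is an illustration under assumptions, not a reference implementation: `build_grounded_prompt`, `answer_or_abstain`, the prompt wording, and the 0.7 threshold are all hypothetical. The threshold rule is an inference-time stand-in for abstention and is not abstention training itself. It also assumes the candidate probabilities are reasonably calibrated.

```python
ABSTAIN = "I do not know"

def build_grounded_prompt(question, passages):
    """Place retrieved passages in the prompt and explicitly permit abstention."""
    sources = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the sources below. "
        f'If they do not contain the answer, reply "{ABSTAIN}".\n\n'
        f"Sources:\n{sources}\n\nQuestion: {question}\nAnswer:"
    )

def answer_or_abstain(candidates, threshold=0.7):
    """Return the most probable candidate, or abstain below the threshold.

    `candidates` maps answer strings to probabilities. How those
    probabilities are obtained is outside this sketch.
    """
    if not candidates:
        return ABSTAIN
    best, prob = max(candidates.items(), key=lambda kv: kv[1])
    return best if prob >= threshold else ABSTAIN

print(build_grounded_prompt("Which option is supported?", ["Passage one.", "Passage two."]))
print(answer_or_abstain({"A": 0.9, "B": 0.1}))    # confident: returns "A"
print(answer_or_abstain({"A": 0.4, "B": 0.35}))   # unsure: abstains
```

Raising the threshold trades coverage for accuracy: the model answers fewer questions, but the answers it gives are less likely to be fabricated.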
Key idea
Hallucinations arise because models optimize for plausibility rather than truth, store knowledge lossily, and lack a built-in uncertainty signal, so grounding and abstention training are key mitigations.