What abstractive summarization does
Abstractive summarization generates a summary in fresh wording, compressing and paraphrasing the source rather than copying sentences. The result usually reads more fluently than an extractive summary, but the model can invent details that the source does not contain.
How it works
- A sequence-to-sequence model encodes the document and decodes a short summary token by token.
- Training pairs long documents with human-written summaries and minimizes the generation loss, typically the negative log-likelihood of each reference token given the document and the preceding reference tokens.
- Large pretrained encoder-decoder models give strong results with limited fine-tuning.
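The decoding and training steps above can be sketched with a toy stand-in for the model. Everything named here is an assumption made for illustration: `toy_model` is a hypothetical lookup table rather than a real encoder-decoder, it ignores the source document entirely, and `EOS`, `greedy_decode`, and `generation_loss` are illustrative names. Greedy decoding is only one possible decoding strategy.

```python
import math

EOS = 0  # hypothetical end-of-sequence token id


def greedy_decode(next_token_probs, max_len=10):
    """Decode token by token, always picking the most probable next token."""
    output = []
    for _ in range(max_len):
        probs = next_token_probs(output)
        token = max(range(len(probs)), key=probs.__getitem__)
        if token == EOS:
            break
        output.append(token)
    return output


def generation_loss(next_token_probs, reference):
    """Mean negative log-likelihood of the reference summary.

    Each step is conditioned on the preceding reference tokens
    (teacher forcing), not on the model's own predictions.
    """
    targets = reference + [EOS]
    total = 0.0
    for t, target in enumerate(targets):
        probs = next_token_probs(targets[:t])
        total -= math.log(probs[target])
    return total / len(targets)


def toy_model(prefix):
    """Stand-in for an encoder-decoder: returns a distribution over a
    3-token vocabulary that depends only on how many tokens were emitted.
    A real model would also condition on the encoded document."""
    table = [
        [0.1, 0.7, 0.2],  # step 0: token 1 is most likely
        [0.2, 0.2, 0.6],  # step 1: token 2 is most likely
        [0.8, 0.1, 0.1],  # step 2 onward: EOS is most likely
    ]
    return table[min(len(prefix), len(table) - 1)]


summary_ids = greedy_decode(toy_model)
loss = generation_loss(toy_model, [1, 2])
```

In this toy setup the decoder emits tokens 1 and 2 and then stops at `EOS`. Fine-tuning would adjust the model's parameters so that `generation_loss` falls on the training pairs.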
The hallucination risk
Because the model writes new text, it can state facts not in the source. This is hallucination, and it is the central danger of abstractive systems.
- A summary may add a wrong date or attribute a quote to the wrong person.
- Faithfulness checks compare claims in the summary against the source.
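A very rough faithfulness check can be sketched as a surface comparison: flag any number or capitalized word in the summary that never appears in the source. This is a simple heuristic assumed here for illustration, not a standard metric, and `unsupported_details` is a hypothetical helper name. The example sentences are made up.

```python
import re

TOKEN = re.compile(r"[A-Za-z]+|\d+")


def unsupported_details(source, summary):
    """Return numbers and capitalized words in the summary that the
    source never mentions (compared case-insensitively)."""
    source_vocab = {tok.lower() for tok in TOKEN.findall(source)}
    flagged = []
    for tok in TOKEN.findall(summary):
        is_detail = tok.isdigit() or tok[0].isupper()
        if is_detail and tok.lower() not in source_vocab and tok not in flagged:
            flagged.append(tok)
    return flagged


source = "The bridge opened in 2019. Mayor Lee said the project finished early."
faithful = "Mayor Lee said the bridge, opened in 2019, finished early."
invented = "Mayor Kim said the bridge opened in 2021."

clean_flags = unsupported_details(source, faithful)
bad_flags = unsupported_details(source, invented)
```

Here `clean_flags` is empty, while `bad_flags` contains `Kim` and `2021`. A check like this catches an invented name or date, but it cannot catch a quote attributed to the wrong person when both people appear in the source, so it is a first filter at best.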
Evaluation
- ROUGE measures the overlap of n-grams between the generated summary and reference summaries. It correlates only loosely with human judgments of quality.
- Faithfulness metrics test whether the summary is entailed by the source, since high ROUGE can still hide invented facts.
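To make the point about ROUGE concrete, here is a minimal sketch of ROUGE-N with clipped n-gram counts, assuming whitespace tokenization and lowercasing. Published ROUGE implementations differ in tokenization, stemming, and multi-reference handling, so the function `rouge_n` and its return format are illustrative only. The example sentences are made up.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """ROUGE-N precision, recall, and F1 against a single reference."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection clips each n-gram to its count in the reference.
    overlap = sum((cand & ref).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


reference = "the bridge opened in 2019"
candidate = "the bridge opened in 2021"  # wrong year, otherwise identical

scores = rouge_n(candidate, reference, n=1)
```

Four of the five unigrams match, so ROUGE-1 precision and recall are both 0.8 even though the only changed token is the one fact that matters. An overlap score cannot separate this summary from a harmless paraphrase, which is why an entailment-based faithfulness check is needed alongside it.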
Key idea
Abstractive summarization generates new, fluent text that compresses the source. It gives up extraction's guarantee that every sentence is copied from the source in exchange for paraphrasing power, and the price is possible hallucination.