Dead Versions Pile Up
Because MVCC keeps old versions, updates and deletes leave behind dead tuples that no live snapshot can see. Left alone, they bloat tables and indexes and slow scans. Vacuum is the background process that reclaims them.
What Vacuum Does
- It scans pages and removes tuples whose deleting transaction (xmax) committed before the oldest active snapshot was taken, so no current or future snapshot can see them.
- It marks reclaimed space as free for new rows, usually without shrinking the file.
- It also freezes sufficiently old tuples so they stay visible after transaction ID wraparound.
- It updates the visibility map, which marks pages whose tuples are all visible to every transaction, so later scans can skip work on those pages.
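The per-page work can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular engine implements it: the `HeapTuple` and `Page` classes, the `vacuum_page` function, and the `committed` set are hypothetical names, transaction ids are compared as plain integers (so wraparound and freezing are left out), and index cleanup is ignored.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class HeapTuple:
    xmin: int                    # id of the transaction that created this version
    xmax: Optional[int] = None   # id of the transaction that deleted it, if any


@dataclass
class Page:
    tuples: List[HeapTuple] = field(default_factory=list)
    free_slots: int = 0          # space reclaimed in place, reusable by new rows
    all_visible: bool = False    # stands in for this page's visibility-map bit


def vacuum_page(page: Page, oldest_active_xid: int, committed: Set[int]) -> int:
    """Remove dead tuples from one page and return how many were reclaimed.

    Assumption: a smaller id means an older transaction (no wraparound).
    """
    kept = []
    for t in page.tuples:
        # Dead only if the deleter committed AND is older than every active snapshot.
        dead = (
            t.xmax is not None
            and t.xmax in committed
            and t.xmax < oldest_active_xid
        )
        if not dead:
            kept.append(t)

    reclaimed = len(page.tuples) - len(kept)
    page.tuples = kept
    page.free_slots += reclaimed  # the file does not shrink; space is reused

    # The page is all-visible when every remaining tuple was created by an old,
    # committed transaction and has not been deleted by anyone.
    page.all_visible = all(
        t.xmin in committed and t.xmin < oldest_active_xid and t.xmax is None
        for t in kept
    )
    return reclaimed
```

Note that a tuple whose deleter aborted, or whose deleter is newer than the oldest active snapshot, is kept: some snapshot may still need it.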
Cost and Tuning
Vacuuming every table continuously would be expensive, so engines track an estimate of dead tuples per table and trigger autovacuum when that estimate crosses a threshold. Vacuum runs concurrently with normal queries and takes only lightweight locks, so it rarely blocks user work.
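As one concrete example of such a trigger, PostgreSQL compares the dead tuple estimate against a base threshold plus a fraction of the table's row count. The sketch below mirrors that rule; the function name `needs_vacuum` is hypothetical, and the defaults of 50 and 0.2 correspond to PostgreSQL's `autovacuum_vacuum_threshold` and `autovacuum_vacuum_scale_factor` settings. Other engines may use a different rule.

```python
def needs_vacuum(
    dead_tuples: int,
    live_tuples: int,
    base_threshold: int = 50,
    scale_factor: float = 0.2,
) -> bool:
    """Return True when the dead tuple estimate exceeds the trigger threshold.

    The threshold grows with table size, so large tables tolerate more dead
    tuples in absolute terms before a vacuum is scheduled.
    """
    threshold = base_threshold + scale_factor * live_tuples
    return dead_tuples > threshold
```

With these defaults, a table of 1,000 rows is vacuumed once it accumulates more than 250 dead tuples, while a nearly empty table still waits for more than 50. Lowering the scale factor for very large tables is a common way to make vacuum run more often on them.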
Key idea
Vacuum reclaims dead MVCC versions that no snapshot can see, frees space in place, advances freezing, and updates the visibility map, usually triggered automatically by dead tuple thresholds.