Bounding Recovery Work
After a crash, recovery must replay the log to redo changes that never reached their data pages. Without a bound, recovery would have to scan the log from its very beginning, and that log could be enormous. A checkpoint writes a marker that lets recovery start from a recent point instead.
What a Checkpoint Does
- It records the set of active transactions and the oldest log position they need.
- It either flushes dirty pages or records which pages are still dirty, so recovery knows up to which LSN changes are already durable.
- It writes a checkpoint record to the log naming the safe starting LSN.
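The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not any particular engine's format: the names CheckpointRecord, take_checkpoint, and records_to_replay are hypothetical, the log is modeled as a list of (lsn, payload) tuples, and the safe starting LSN is assumed to be the smallest LSN still needed by an active transaction or an unflushed page.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class CheckpointRecord:
    # All field names are hypothetical, chosen for this sketch.
    active_txns: Dict[int, int]   # txn id -> LSN of that txn's first log record
    dirty_pages: Dict[int, int]   # page id -> LSN of its earliest unflushed change
    start_lsn: int                # where recovery should begin reading the log


def take_checkpoint(active_txns: Dict[int, int],
                    dirty_pages: Dict[int, int],
                    current_lsn: int) -> CheckpointRecord:
    """Build a checkpoint record naming the safe starting LSN.

    Assumption: the safe point is the oldest LSN any active transaction
    or dirty page still depends on. If nothing is outstanding, recovery
    can start at the current end of the log.
    """
    candidates = list(active_txns.values()) + list(dirty_pages.values())
    start_lsn = min(candidates, default=current_lsn)
    return CheckpointRecord(dict(active_txns), dict(dirty_pages), start_lsn)


def records_to_replay(log: List[Tuple[int, str]],
                      checkpoint: CheckpointRecord) -> List[Tuple[int, str]]:
    """Return only the log records at or after the checkpoint's start LSN."""
    return [rec for rec in log if rec[0] >= checkpoint.start_lsn]
```

For example, with a ten-record log, one active transaction that began at LSN 6, and dirty pages whose earliest unflushed changes are at LSNs 4 and 8, the sketch picks 4 as the starting LSN and recovery skips records 1 through 3.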
Fuzzy Checkpoints
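A naive checkpoint that stalls all writes until every dirty page is flushed hurts throughput. Modern engines use fuzzy checkpoints, which flush dirty pages gradually in the background while transactions keep running. Because pages can be modified while the flush is in progress, the checkpoint record names a starting LSN old enough to cover every change that may not yet be on disk, so recovery still begins from a consistent point even though flushing was spread over time.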
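The interleaving of flushing and writing can be simulated without threads. The sketch below is a simplified, single-threaded model under stated assumptions: BufferPool, flush_some, and fuzzy_checkpoint are hypothetical names, each write gets the next LSN, and the flusher handles a small budget of pages per step, oldest first. It is meant only to show why the starting LSN must account for pages dirtied during the checkpoint.

```python
from typing import Dict, Iterable, Set


class BufferPool:
    """Toy buffer pool that tracks, per dirty page, its earliest unflushed LSN."""

    def __init__(self) -> None:
        self.next_lsn = 1
        self.dirty: Dict[int, int] = {}  # page id -> earliest unflushed LSN

    def write(self, page_id: int) -> int:
        lsn = self.next_lsn
        self.next_lsn += 1
        # Keep the earliest LSN if the page is already dirty.
        self.dirty.setdefault(page_id, lsn)
        return lsn

    def flush_some(self, budget: int) -> Set[int]:
        """Flush up to `budget` dirty pages, oldest change first."""
        oldest_first = sorted(self.dirty, key=self.dirty.get)
        flushed = set(oldest_first[:budget])
        for page_id in flushed:
            del self.dirty[page_id]
        return flushed


def fuzzy_checkpoint(pool: BufferPool,
                     incoming_writes: Iterable[int],
                     budget: int = 2) -> int:
    """Flush the pages that were dirty at checkpoint start in small batches,
    letting one write through between batches, then return the start LSN."""
    targets = set(pool.dirty)          # pages dirty when the checkpoint began
    writes = iter(incoming_writes)
    while targets:
        targets -= pool.flush_some(budget)
        page = next(writes, None)      # a transaction keeps running
        if page is not None:
            pool.write(page)
    # Pages dirtied during the checkpoint are still unflushed, so the
    # starting LSN is the oldest change among them.
    return min(pool.dirty.values(), default=pool.next_lsn)
```

In this model, writes are never blocked for longer than one small flush batch, and the returned LSN still covers every change that was not flushed by the time the checkpoint record would be written.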
Key idea
A checkpoint records active transactions and a safe LSN so recovery replays only recent log, and fuzzy checkpoints flush dirty pages gradually to avoid stalling writes.