← Lessons

quiz vs the machine

Gold1500

Databases

Denormalization For Reads

Deliberately add redundancy to make read heavy queries faster.

5 min read · core · beat Gold to climb

Denormalization For Reads

Denormalization deliberately reintroduces redundancy that normalization removed, in exchange for faster reads. Fully normalized schemas can force many joins to assemble one view, which becomes expensive under heavy read traffic.

Common denormalization techniques:

  • Duplicated columns copy a parent value onto the child, such as storing the author name on each post to skip a join.
  • Precomputed aggregates store a running count or sum, such as a comment count on a post, so you do not aggregate on every read.
  • Materialized views persist the result of an expensive query and refresh it periodically.

The cost is that each duplicated fact now lives in more than one place. When the source changes you must update every copy, or the copies drift and become inconsistent. This shifts complexity from read time to write time, and someone must own keeping copies in sync, often a trigger, an application hook, or a scheduled job.

The decision is a tradeoff. Denormalize when reads vastly outnumber writes, when the joins are measurably slow, and when slightly stale data is acceptable. Keep the normalized tables as the source of truth and treat denormalized copies as a derived cache. That way you can always rebuild the redundant data from the authoritative rows.

Key idea

Denormalization trades extra write complexity and consistency risk for faster reads, so apply it only where reads dominate and keep a normalized source of truth.

Check yourself

Answer to earn rating on the learn ladder.

1. What is the main benefit of denormalization?

2. What is the main cost of denormalization?

3. What should remain the source of truth when you denormalize?