The Tradeoff
Normalized schemas store each fact exactly once, which keeps writes simple and avoids inconsistency. But answering a read may require joining many tables. Denormalization deliberately duplicates data so a read can be satisfied with fewer joins, trading write effort and storage for read speed.
Common Techniques
- Precomputed columns, such as a stored comment count that replaces counting rows on every read.
- Embedded copies, such as an author name stored alongside each post.
- Materialized aggregates, kept up to date as the underlying data changes.
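As a minimal sketch of the first technique, the following uses Python's built-in sqlite3 module with an in-memory database. The table names, column names, and helper functions are assumptions made up for illustration; they are not taken from any particular system.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a hypothetical posts/comments schema with a precomputed count."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE posts (
            id INTEGER PRIMARY KEY,
            title TEXT NOT NULL,
            comment_count INTEGER NOT NULL DEFAULT 0  -- precomputed column
        );
        CREATE TABLE comments (
            id INTEGER PRIMARY KEY,
            post_id INTEGER NOT NULL REFERENCES posts(id),
            body TEXT NOT NULL
        );
        """
    )
    return conn


def add_comment(conn: sqlite3.Connection, post_id: int, body: str) -> None:
    """Insert a comment and bump the stored count in one transaction."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "INSERT INTO comments (post_id, body) VALUES (?, ?)",
            (post_id, body),
        )
        conn.execute(
            "UPDATE posts SET comment_count = comment_count + 1 WHERE id = ?",
            (post_id,),
        )


def count_by_scan(conn: sqlite3.Connection, post_id: int) -> int:
    """Normalized read: count the comment rows every time."""
    row = conn.execute(
        "SELECT COUNT(*) FROM comments WHERE post_id = ?", (post_id,)
    ).fetchone()
    return row[0]


def count_stored(conn: sqlite3.Connection, post_id: int) -> int:
    """Denormalized read: fetch the precomputed column from the post row."""
    row = conn.execute(
        "SELECT comment_count FROM posts WHERE id = ?", (post_id,)
    ).fetchone()
    return row[0]
```

Both read functions return the same number as long as every comment is written through add_comment. The stored count makes the read cheaper at the price of a second statement on every write.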
The Cost You Take On
- Every duplicated value must be updated everywhere when the source changes.
- If updates miss a copy, the data becomes inconsistent.
- Storage grows, and each logical change touches more rows (write amplification).
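The costs listed here can be made concrete with an embedded author name. This is a sketch under assumed names (the authors and posts tables, rename_author, and stale_posts are all hypothetical), again using Python's sqlite3 module.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a hypothetical schema where each post embeds its author's name."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE authors (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE posts (
            id INTEGER PRIMARY KEY,
            author_id INTEGER NOT NULL REFERENCES authors(id),
            author_name TEXT NOT NULL,  -- embedded copy of authors.name
            title TEXT NOT NULL
        );
        """
    )
    return conn


def rename_author(conn: sqlite3.Connection, author_id: int, new_name: str) -> int:
    """Update the source row and every embedded copy in one transaction.

    Returns the number of post rows rewritten, which is the write
    amplification caused by this single logical change.
    """
    with conn:
        conn.execute(
            "UPDATE authors SET name = ? WHERE id = ?", (new_name, author_id)
        )
        cur = conn.execute(
            "UPDATE posts SET author_name = ? WHERE author_id = ?",
            (new_name, author_id),
        )
    return cur.rowcount


def stale_posts(conn: sqlite3.Connection) -> list[int]:
    """Return ids of posts whose embedded name no longer matches the source."""
    rows = conn.execute(
        """
        SELECT p.id
        FROM posts AS p
        JOIN authors AS a ON a.id = p.author_id
        WHERE p.author_name <> a.name
        ORDER BY p.id
        """
    ).fetchall()
    return [r[0] for r in rows]
```

A rename that goes through rename_author rewrites one author row plus one row per post by that author. A rename that updates only the authors table leaves every copy stale, and a consistency check like stale_posts is one way to detect that drift after the fact.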
When It Is Worth It
Denormalize when a read is hot, the join is expensive, and the underlying data changes rarely relative to how often it is read. Author names change far less often than posts are displayed, so embedding them usually pays off.
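One rough way to weigh this rule is a back-of-the-envelope cost model. The function below is an illustrative assumption, not an established formula: it treats all costs as abstract units and ignores storage, caching, and contention.

```python
def net_saving(
    reads: int,
    writes: int,
    join_cost: float,
    flat_read_cost: float,
    copies_per_write: int,
    copy_update_cost: float,
) -> float:
    """Estimate the work saved by denormalizing, in arbitrary cost units.

    Hypothetical model: each read saves (join_cost - flat_read_cost),
    and each change to the source pays copy_update_cost per duplicated copy.
    A positive result suggests denormalization pays off under these inputs.
    """
    read_gain = reads * (join_cost - flat_read_cost)
    write_penalty = writes * copies_per_write * copy_update_cost
    return read_gain - write_penalty


# Read-heavy case: a million displays, ten renames, 200 copies per rename.
read_heavy = net_saving(1_000_000, 10, 5.0, 1.0, 200, 1.0)

# Write-heavy case: few displays, many changes to the duplicated value.
write_heavy = net_saving(100, 1_000, 5.0, 1.0, 200, 1.0)
```

With these made-up inputs the read-heavy case comes out strongly positive and the write-heavy case negative, which matches the guidance above: the benefit depends on how rarely the data changes relative to how often it is read.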
Key idea
Denormalization speeds reads by duplicating data, accepting write amplification and inconsistency risk in exchange for fewer joins.