System Design

Large Document Performance

Keeping a huge shared document fast as edits and metadata pile up.

Where size hurts

A long collaborative document strains memory and CPU because every character can carry its own metadata, such as a unique id and a tombstone flag, and because the operation history grows without bound. Naive rendering and merging then slow to a crawl.

Key techniques

  • Block compression stores runs of consecutive characters under one id range instead of one record each.
  • Garbage collection of tombstones removes deletion markers once all replicas have seen the corresponding deletes.
  • Snapshotting periodically collapses history so new clients do not replay millions of operations.
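
As a rough illustration of block compression, the sketch below collapses per-character records into runs whenever consecutive characters come from the same replica with consecutive counters. The `Run` type, the `(replica, counter, char)` record shape, and the `compress` function are assumptions made for this example, not the layout of any particular CRDT library.

```python
from dataclasses import dataclass


@dataclass
class Run:
    replica: str  # hypothetical replica identifier
    start: int    # counter of the first character in the run
    text: str     # characters with ids (replica, start), (replica, start + 1), ...


def compress(chars):
    """Collapse (replica, counter, char) records into runs of consecutive ids."""
    runs = []
    for replica, counter, ch in chars:
        last = runs[-1] if runs else None
        # A character extends the current run only if it continues the id range.
        if last and last.replica == replica and last.start + len(last.text) == counter:
            last.text += ch
        else:
            runs.append(Run(replica, counter, ch))
    return runs


chars = [("a", 0, "H"), ("a", 1, "i"), ("b", 0, "!"), ("a", 2, "?")]
runs = compress(chars)  # three runs instead of four per-character records
```

Typing a long passage in one go produces a single run, so metadata cost scales with the number of editing bursts rather than the number of characters.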

Rendering and loading

Large documents use virtualized rendering so only the visible portion is laid out, and lazy loading of sections so opening the file does not deserialize everything at once.
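
A minimal sketch of the windowing calculation behind virtualized rendering, assuming every line has the same fixed height. The function name `visible_range` and the `overscan` parameter (extra lines laid out above and below the viewport to smooth scrolling) are hypothetical choices for this example.

```python
def visible_range(scroll_top, viewport_height, line_height, total_lines, overscan=2):
    """Return the half-open range [first, last) of lines that need layout."""
    first = max(0, scroll_top // line_height - overscan)
    # Round up so a partially visible line at the bottom is still included.
    bottom = scroll_top + viewport_height
    last = min(total_lines, (bottom + line_height - 1) // line_height + overscan)
    return first, last


# Scrolled 1000 px into a 100,000-line document with a 400 px viewport.
window = visible_range(1000, 400, 20, 100_000)
```

Only the lines in the returned range are laid out, so the cost of a frame depends on the viewport size rather than the document length. Real editors with variable line heights need a height index instead of plain division.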

Tombstone collection is delicate: removing a marker before a slow replica has seen the deletion can break convergence, so collection waits for a safe causal threshold, the point at which every replica is known to have received the delete.
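
One way to picture the causal threshold is as a check against what each replica has acknowledged. In the sketch below, every tombstone is identified by the `(origin, counter)` of its delete operation, and `acks` maps each replica to the highest counter it has seen per origin. These shapes, and the name `collectable`, are assumptions for illustration; the sketch also assumes `acks` lists every replica that can still send edits.

```python
def collectable(tombstones, acks):
    """Return tombstones whose delete operation every known replica has seen."""
    safe = []
    for origin, counter in tombstones:
        # Safe only if no replica is still behind this delete.
        if all(seen.get(origin, 0) >= counter for seen in acks.values()):
            safe.append((origin, counter))
    return safe


acks = {
    "a": {"a": 5, "b": 3},
    "b": {"a": 4, "b": 3},
    "c": {"a": 2, "b": 3},  # slow replica: has only seen a's operations up to 2
}
tombstones = [("a", 2), ("a", 4), ("b", 3)]
safe = collectable(tombstones, acks)
```

Here the tombstone for `("a", 4)` is kept because replica `c` has not yet seen that delete; dropping it early could leave `c` unable to interpret operations that refer to the removed character.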

Key idea

Large collaborative documents stay fast through block compression, safe tombstone collection, snapshots, and virtualized lazy rendering.

Check yourself

1. What does block compression do for a text CRDT?

2. Why must tombstone collection wait for a causal threshold?

3. How does virtualized rendering help large documents?