← Lessons

quiz vs the machine

Platinum1780

Databases

Distributed Global Secondary Indexes

Querying by a field that is not the shard key.

6 min read · advanced · beat Platinum to climb

The Problem

In a sharded store, the shard key routes lookups efficiently. But applications often query by other fields, like email when data is sharded by user id. A secondary index lets you do that, and in a distributed system there are two ways to build one.

Local Secondary Index

Each shard indexes only its own rows. The index is partitioned by the shard key of the base data.

  • Writes stay local because a row and its index entry sit on the same shard.
  • Reads by the secondary field must scatter to every shard, since matching rows could be anywhere. This is also called a scatter gather read.

Global Secondary Index

The index is partitioned by the indexed field itself, spread across shards independently of the base data.

  • Reads by that field hit only the shard owning that index range, so they are targeted and fast.
  • Writes become harder. A single base write may need to update an index entry on a different shard, splitting one logical write across nodes and raising consistency questions.

The Tradeoff

Local indexes make writes cheap and reads broad. Global indexes make reads targeted and writes distributed. Many systems keep global indexes eventually consistent, updating them asynchronously so the base write stays fast at the cost of a brief lag.

Key idea

Local secondary indexes keep writes local but scatter reads; global secondary indexes target reads but spread each write across shards, often updated asynchronously.

Check yourself

Answer to earn rating on the learn ladder.

1. How does a local secondary index serve a read by the indexed field?

2. Why are writes harder with a global secondary index?

3. How do many systems keep global index writes fast?