← Lessons

quiz vs the machine

Silver1130

Databases

Heap Files vs Clustered Indexes

Table rows can live in an unordered heap or be physically sorted by a key.

4 min read · intro · beat Silver to climb

Two Ways To Store A Table

Where do the actual rows live? Two layouts dominate. A heap file stores rows in no particular order, wherever there is free space. A clustered index stores the rows themselves inside the leaves of a btree ordered by the index key.

Heap Files

  • Rows are appended where space allows, so inserts are cheap.
  • Every secondary index points to a row by its physical location.
  • A full table scan reads pages in arbitrary order.

Clustered Indexes

  • The table is the index. Leaf pages hold the full rows sorted by the clustering key.
  • Range scans on that key are fast because matching rows sit together.
  • Secondary indexes point to the clustering key, so a lookup costs an extra btree traversal.

PostgreSQL uses heaps by default while InnoDB always clusters on the primary key. Choosing a clustering key shapes which queries stay cheap.

Key idea

A heap stores rows unordered for cheap inserts, while a clustered index sorts rows by a key inside the btree, trading insert cost for fast ordered range access.

Check yourself

Answer to earn rating on the learn ladder.

1. In a clustered index, where do the full rows live?

2. Why are range scans on the clustering key fast?