Two Ways To Store A Table
Where do the actual rows live? Two layouts dominate. A heap file stores rows in no particular order, wherever there is free space. A clustered index stores the rows themselves inside the leaves of a btree ordered by the index key.
Heap Files
- Rows are appended where space allows, so inserts are cheap.
- Every secondary index points to a row by its physical location.
- A full table scan reads pages in arbitrary order.
Clustered Indexes
- The table is the index. Leaf pages hold the full rows sorted by the clustering key.
- Range scans on that key are fast because matching rows sit together.
- Secondary indexes point to the clustering key, so a lookup costs an extra btree traversal.
PostgreSQL uses heaps by default while InnoDB always clusters on the primary key. Choosing a clustering key shapes which queries stay cheap.
Key idea
A heap stores rows unordered for cheap inserts, while a clustered index sorts rows by a key inside the btree, trading insert cost for fast ordered range access.