← Lessons

quiz vs the machine

Silver1100

Databases

The Heap File Organization

A heap file stores rows in no particular order, so inserts are cheap but every lookup needs an index or a full scan.

4 min read · intro · beat Silver to climb

What a Heap File Is

A heap file is the simplest way to lay out a table on disk: rows are stored in no particular order, placed wherever free space exists. A new row goes into any page with room, often the most recently used one.

How Rows Are Found

Because the order is arbitrary, a row is located by its physical address, often a page number plus a slot, called a row identifier or tuple id. Secondary indexes point at these identifiers.

  • A lookup by indexed column reads the index, gets the row identifier, then fetches the page.
  • A query with no useful index must perform a full table scan, reading every page.

Heap Versus Clustered

The main alternative is a clustered organization, where rows are physically sorted by the primary key, as in a B tree index organized table.

  • Heaps make inserts cheap, since a row drops into any free spot.
  • Clustered tables make range scans on the key cheap, since adjacent keys sit on adjacent pages.
  • Heaps can leave gaps after deletes, needing periodic vacuum or reorganization.

Where Heaps Fit

Heaps suit insert heavy tables and staging areas where access is mostly through secondary indexes rather than ordered scans of the primary key.

Key idea

A heap file stores rows unordered for cheap inserts, relying on indexes or full scans to find rows, unlike a clustered table sorted by its key.

Check yourself

Answer to earn rating on the learn ladder.

1. How are rows arranged in a heap file?

2. What does a query with no useful index require on a heap?