What an SSTable Is
A sorted string table, often written SSTable, is an immutable on disk file holding key value pairs sorted by key. Because it is immutable, it is never edited in place. Updates and deletes are expressed as new entries in newer files.
Internal Layout
An SSTable is divided into pieces that work together.
- Data blocks hold the sorted key value pairs, usually a few kilobytes each so they fit one disk read.
- A block index maps the first key of each block to its file offset, so a lookup binary searches the index rather than the whole file.
- A bloom filter answers whether a key might be present before any block is read.
- A footer at the end points to the index and filter so a reader can bootstrap.
How a Lookup Works
The reader loads the footer, then the index, consults the bloom filter, binary searches the index to find the candidate block, reads that block, and scans within it. That is typically one or two disk reads.
Deletes as Tombstones
A delete writes a tombstone, a marker that hides older values for the key until compaction removes both.
Key idea
An SSTable is an immutable sorted file with data blocks, a sparse index, a bloom filter, and a footer, letting the engine reach any key in a read or two.