Flipping The Mapping
A document store naturally maps a document to the words it contains. Search needs the opposite: given a word, which documents contain it. An inverted index provides exactly this, mapping each distinct term to a list of documents that hold it.
Anatomy
- The dictionary is the sorted set of all distinct terms.
- Each term points to a postings list of document identifiers, often with positions and frequencies.
- Storing positions enables phrase queries, matching terms that appear adjacent and in order.
Answering Queries
For a multi word query, the engine fetches each term postings list and combines them.
- An AND query intersects the lists, keeping documents present in all.
- An OR query unions them.
Because postings lists are sorted by document identifier, intersection and union are fast merge operations. Term frequency and document rarity then drive a relevance score that orders the results.
Key idea
An inverted index maps each term to a sorted postings list of documents, turning keyword search into fast merges over those lists followed by relevance ranking.