← Lessons

quiz vs the machine

Gold1420

Databases

Inverted Index For Search

The term to document mapping that makes keyword search scalable.

5 min read · core · beat Gold to climb

Flipping The Mapping

A document store naturally maps a document to the words it contains. Search needs the opposite: given a word, which documents contain it. An inverted index provides exactly this, mapping each distinct term to a list of documents that hold it.

Anatomy

  • The dictionary is the sorted set of all distinct terms.
  • Each term points to a postings list of document identifiers, often with positions and frequencies.
  • Storing positions enables phrase queries, matching terms that appear adjacent and in order.

Answering Queries

For a multi word query, the engine fetches each term postings list and combines them.

  • An AND query intersects the lists, keeping documents present in all.
  • An OR query unions them.

Because postings lists are sorted by document identifier, intersection and union are fast merge operations. Term frequency and document rarity then drive a relevance score that orders the results.

Key idea

An inverted index maps each term to a sorted postings list of documents, turning keyword search into fast merges over those lists followed by relevance ranking.

Check yourself

Answer to earn rating on the learn ladder.

1. What does a postings list contain?

2. Why are postings lists kept sorted by document identifier?

3. What extra data enables phrase queries?