The core idea
A naive search reads every document and checks for the query word. That is a forward scan and it does not scale. An inverted index flips the relationship: instead of asking which words a document holds, it stores, for each word, the list of documents that contain it.
That per word list is called a postings list. Each entry usually holds a document id and often the positions where the word appears, which lets the engine answer phrase queries.
Why it is fast
- The vocabulary of unique terms is small compared to total text.
- A lookup jumps straight to one term and reads only its postings.
- Multi word queries intersect postings lists rather than rescanning text.
What it costs
- The index must be built ahead of time and updated as documents change.
- It uses extra storage, though postings compress well because ids are sorted.
The inverted index is the foundation under nearly every full text search engine, from a small library to a global web index.
Key idea
An inverted index maps each term to the documents that contain it so queries read only relevant postings instead of scanning all text.