Databases

Full Text Search

Match words and phrases in documents, not just exact strings.

Beyond LIKE

A LIKE query with wildcards, such as LIKE '%run%', matches raw substrings: it typically has to scan every row, cannot rank results, and does not understand word forms. Full text search instead treats text as a bag of words, so you can find documents that contain given terms and order them by relevance.
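
To make the limits of LIKE concrete, here is a small sketch using Python's built-in sqlite3 module. The docs table and its rows are made up for illustration, and the matching shown relies on SQLite's default case-insensitive LIKE for ASCII text.

```python
import sqlite3

# Hypothetical table and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO docs (body) VALUES (?)",
    [
        ("She runs every morning",),
        ("Prune the roses in spring",),
        ("Running shoes on sale",),
    ],
)

def like_ids(pattern):
    """Return the ids of rows whose body matches the LIKE pattern."""
    rows = conn.execute(
        "SELECT id FROM docs WHERE body LIKE ? ORDER BY id", (pattern,)
    ).fetchall()
    return [row[0] for row in rows]

# Substring matching finds "runs" and "Running", but also "Prune".
substring_hits = like_ids("%run%")

# Searching for one word form misses the others: "runs" is not found.
word_form_hits = like_ids("%running%")

print(substring_hits, word_form_hits)
```

The first query returns a false positive ("Prune") and the second misses a document about running, and neither query has any notion of which row is the better match.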

How It Works

  • Text is tokenized into individual words.
  • A stemmer reduces words to a root, so "running" and "runs" both match "run".
  • Common stop words like "the" and "is" are dropped because they appear almost everywhere and add little to a search.
  • The result is stored in an inverted index that maps each term to the documents holding it.
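
The steps above can be sketched in a few lines of Python. This is a toy pipeline under loud assumptions: the stop word list is tiny, toy_stem is a crude made-up suffix stripper rather than a real stemming algorithm, and all function names are hypothetical.

```python
import re
from collections import defaultdict

STOP_WORDS = {"the", "is", "a", "and"}  # tiny illustrative list

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def toy_stem(word):
    """Crude suffix stripping; real stemmers handle far more cases."""
    if word.endswith("ing") and len(word) > 5:
        word = word[:-3]
        if len(word) >= 2 and word[-1] == word[-2]:
            word = word[:-1]  # "runn" -> "run"
    elif word.endswith("s") and len(word) > 3:
        word = word[:-1]
    return word

def analyze(text):
    """Tokenize, drop stop words, then stem what remains."""
    return [toy_stem(t) for t in tokenize(text) if t not in STOP_WORDS]

def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every term in the query."""
    terms = analyze(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

docs = {1: "The dog is running", 2: "She runs the shop", 3: "A quiet shop"}
index = build_index(docs)
print(search(index, "running"))    # documents 1 and 2
print(search(index, "runs shop"))  # document 2 only
```

Note that the query goes through the same analyze step as the documents. That is what lets a search for "running" find a document that only contains "runs".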

Ranking

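Search engines score documents so the best matches appear first. A common scheme, TF-IDF, multiplies how often a term appears in a document (term frequency) by a measure of how rare the term is across all documents (inverse document frequency), so a frequent but rare term counts for the most.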
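
As a rough illustration of that scoring idea, the sketch below computes one simple TF-IDF variant over three already-analyzed documents. The documents, the exact formula, and the function names are assumptions chosen for clarity; real engines use tuned variants of this idea, such as BM25.

```python
import math
from collections import Counter

# Hypothetical documents, already tokenized and stemmed.
docs = {
    1: ["run", "dog", "run"],
    2: ["run", "shop"],
    3: ["quiet", "shop"],
}

def tf_idf(term, doc_id):
    """Term frequency in the document times inverse document frequency."""
    tokens = docs[doc_id]
    tf = Counter(tokens)[term] / len(tokens)
    df = sum(1 for other in docs.values() if term in other)
    if df == 0:
        return 0.0  # term appears nowhere in the collection
    idf = math.log(len(docs) / df)
    return tf * idf

def rank(query_terms):
    """Return ids of matching documents, best score first."""
    scores = {d: sum(tf_idf(t, d) for t in query_terms) for d in docs}
    matches = [d for d in scores if scores[d] > 0]
    return sorted(matches, key=lambda d: scores[d], reverse=True)

print(rank(["run"]))            # document 1 uses "run" more densely than 2
print(rank(["quiet", "shop"]))  # document 3 matches the rarer term "quiet"
```

In the second query, document 3 outranks document 2 because "quiet" occurs in only one document and so carries more weight than "shop", which occurs in two.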

Tradeoffs

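Full text indexes add write cost and storage, because every insert or update must also update the index, and stemming choices affect what matches. They shine for human language but are overkill for exact lookups of codes or identifiers, where an ordinary index does the job.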

Key idea

Full text search tokenizes and stems text into an inverted index, enabling relevance-ranked word and phrase matching that plain LIKE cannot offer.

Check yourself

1. What does a stemmer do during indexing?

2. What data structure powers fast full text lookups?