Machine Learning

Semantic Search Basics

Finding documents by meaning, not exact words.

5 min read · advanced

Traditional keyword search matches exact words. A query for "cheap laptop" misses a page about an "affordable notebook", even though the two phrases mean the same thing. Semantic search fixes this by matching on meaning instead of surface text.

The trick is embeddings. An embedding model maps both documents and queries into the same vector space, where texts with similar meanings sit close together. Searching becomes a nearest-neighbor problem: embed the query, then find the document vectors with the highest cosine similarity to it.
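
To make the nearest-neighbor step concrete, here is a minimal sketch of cosine similarity and a brute-force top-k lookup. The three-dimensional vectors are made up for illustration; a real embedding model would produce vectors with hundreds of dimensions, and the names `cosine_similarity` and `top_k` are just labels chosen for this example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    """Return the k (doc_id, score) pairs with the highest cosine similarity."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Made-up embeddings, for illustration only (not output from a real model).
doc_vecs = {
    "affordable notebook": [0.9, 0.1, 0.0],
    "gaming desktop": [0.4, 0.8, 0.1],
    "banana bread recipe": [0.0, 0.1, 0.9],
}
query_vec = [0.8, 0.2, 0.0]  # pretend embedding of "cheap laptop"

results = top_k(query_vec, doc_vecs, k=2)
for doc_id, score in results:
    print(f"{score:.3f}  {doc_id}")
```

With these toy vectors, "affordable notebook" ranks first because its vector points in almost the same direction as the query vector, even though the two texts share no words.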

A semantic search system has two phases:

  • Indexing, done ahead of time, where every document is embedded and stored
  • Querying, done live, where the query is embedded and compared to the index
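
The two phases can be sketched as a tiny in-memory index. Everything here is hypothetical: `toy_embed` is a hand-written stand-in that puts related words on the same dimension, where a real system would call a trained embedding model, and `TinyIndex` is an illustrative class, not any library's API.

```python
import math

# Hypothetical stand-in for an embedding model: words with related
# meanings share a dimension. A real system would use a trained model.
CONCEPTS = {
    "cheap": 0, "affordable": 0,
    "laptop": 1, "notebook": 1,
    "bread": 2, "recipe": 2,
}

def toy_embed(text):
    """Map text to a unit-length concept-count vector."""
    vec = [0.0, 0.0, 0.0]
    for word in text.lower().split():
        if word in CONCEPTS:
            vec[CONCEPTS[word]] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec

class TinyIndex:
    def __init__(self, embed):
        self.embed = embed
        self.entries = []  # list of (document text, unit vector)

    def add(self, doc):
        # Indexing phase: embed once, ahead of time, and store the vector.
        self.entries.append((doc, self.embed(doc)))

    def search(self, query, k=1):
        # Query phase: embed the query live and compare it to stored vectors.
        # Vectors are unit length, so the dot product is the cosine similarity.
        q = self.embed(query)
        scored = [(doc, sum(a * b for a, b in zip(q, vec)))
                  for doc, vec in self.entries]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]

index = TinyIndex(toy_embed)
for doc in ["An affordable notebook for students", "Banana bread recipe"]:
    index.add(doc)

hits = index.search("cheap laptop", k=1)
print(hits[0][0])
```

The expensive work (embedding every document) happens in `add`, so `search` only has to embed one short query before comparing.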

Scanning every vector is too slow at scale, so systems use approximate nearest neighbor (ANN) search. ANN index structures trade a small amount of accuracy (they can occasionally miss a true nearest neighbor) for a large gain in speed, returning the top matches in milliseconds across millions of vectors.
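
One simple way to see the speed-for-accuracy trade is random-hyperplane hashing, a form of locality-sensitive hashing. The sketch below is an assumption chosen for brevity, not necessarily what a given production system uses (graph-based indexes such as HNSW are common too). Vectors that fall on the same side of every random plane share a bucket, and a query only scans its own bucket. All function names and the data are illustrative.

```python
import math
import random
from collections import defaultdict

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def make_hyperplanes(dim, n_planes, rng):
    """Random plane normals; each plane contributes one bit to the bucket key."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_planes)]

def bucket_key(vec, planes):
    # One boolean per plane: which side of the plane the vector lies on.
    return tuple(sum(p * x for p, x in zip(plane, vec)) >= 0.0
                 for plane in planes)

def build_buckets(vectors, planes):
    buckets = defaultdict(list)
    for doc_id, vec in vectors.items():
        buckets[bucket_key(vec, planes)].append(doc_id)
    return buckets

def ann_search(query, vectors, buckets, planes, k=3):
    """Score only the vectors in the query's bucket, not the whole collection."""
    candidates = buckets.get(bucket_key(query, planes), [])
    scored = [(doc_id, cosine(query, vectors[doc_id])) for doc_id in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k], len(candidates)

# Random vectors standing in for real embeddings.
rng = random.Random(0)
dim, n_docs = 16, 1000
vectors = {f"doc{i}": [rng.gauss(0.0, 1.0) for _ in range(dim)]
           for i in range(n_docs)}
planes = make_hyperplanes(dim, n_planes=8, rng=rng)
buckets = build_buckets(vectors, planes)

query = list(vectors["doc42"])  # a query identical to one stored document
hits, scanned = ann_search(query, vectors, buckets, planes)
print(f"best match: {hits[0][0]}, scanned {scanned} of {n_docs} vectors")
```

The approximation shows up as missed neighbors: a close vector that lands on the other side of even one plane sits in a different bucket and is never scored. Practical systems reduce that loss by probing several buckets or using several hash tables.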

Semantic search shines when wording varies but intent is shared, and it handles synonyms and paraphrases that keyword search drops. A common practical pattern is hybrid search, which blends semantic similarity with keyword matching so that exact terms like product codes still land precisely while meaning-based recall fills the gaps.
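
A minimal sketch of hybrid scoring, assuming a simple weighted blend. The weight `alpha`, the term-overlap `keyword_score`, and the semantic scores are all made up for illustration; real systems typically use a stronger keyword scorer such as BM25 and tune the blend on their own data.

```python
def keyword_score(query, doc):
    """Fraction of query terms that appear verbatim in the document."""
    q_terms = query.lower().split()
    d_terms = set(doc.lower().split())
    if not q_terms:
        return 0.0
    return sum(term in d_terms for term in q_terms) / len(q_terms)

def hybrid_score(semantic, keyword, alpha=0.5):
    """Weighted blend: alpha=1.0 is purely semantic, alpha=0.0 purely keyword."""
    return alpha * semantic + (1.0 - alpha) * keyword

def hybrid_rank(query, semantic_scores, alpha=0.5):
    scored = [(doc, hybrid_score(sem, keyword_score(query, doc), alpha))
              for doc, sem in semantic_scores.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored

# Made-up semantic similarities: the two products read almost the same,
# so meaning alone cannot tell the product codes apart.
semantic_scores = {
    "laptop stand model x200": 0.80,
    "laptop stand model x300": 0.82,
}

ranked = hybrid_rank("x200 laptop stand", semantic_scores, alpha=0.5)
for doc, score in ranked:
    print(f"{score:.3f}  {doc}")
```

On semantic score alone the x300 page would rank first. The keyword component rewards the exact match on "x200" and lifts the right product to the top.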

Key idea

Semantic search embeds queries and documents into one vector space and retrieves by nearest neighbor, matching meaning rather than exact words.

Check yourself

1. How does semantic search differ from keyword search?

2. Why use approximate nearest neighbor search?

3. What does hybrid search combine?