Vector Search for Semantic Retrieval

Matching meaning, not words

Lexical search matches words, so it misses a document that expresses the same idea in different words: a query for "car repair" will not match a page that only says "automobile maintenance". Vector search uses an embedding model to map both the query and each document to a high-dimensional vector, so that semantically similar items sit close together. Retrieval then becomes finding the nearest neighbors of the query vector.
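As a minimal sketch of "nearest neighbors of the query vector", the snippet below ranks a few documents by cosine similarity to a query. The three-dimensional vectors, the document ids, and the function names are made up for illustration; a real system would get its vectors from an embedding model and would use far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_neighbors(query, docs, k=2):
    """Exact search: score every document, keep the k most similar ids."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in docs.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical toy embeddings (assumption: hand-written, not model output).
DOCS = {
    "car_repair": [0.9, 0.1, 0.0],
    "auto_maintenance": [0.8, 0.2, 0.1],
    "banana_bread": [0.0, 0.1, 0.9],
}
QUERY = [0.85, 0.15, 0.05]  # stands in for an embedded query about fixing cars

top = nearest_neighbors(QUERY, DOCS, k=2)
print(top)
```

The two car-related documents rank ahead of the unrelated one even though their ids share no words, which is the behavior lexical matching cannot provide.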

Why approximate search

Exact nearest neighbor search over millions of vectors means comparing the query to every vector, which is often too slow for interactive use. Engines therefore use approximate nearest neighbor (ANN) indexes that trade a little accuracy for a large speedup:

Graph indexes such as hierarchical navigable small world (HNSW) link each vector to nearby vectors and walk the graph toward the query.
Quantization compresses vectors so that more fit in memory and comparisons are cheaper.
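The graph idea can be sketched in a few lines. This is a deliberately simplified, single-layer illustration and not the HNSW algorithm itself, which adds multiple layers and keeps a list of candidates rather than a single current node. The grid of points, the choice of 4 links per node, and all function names are assumptions made for the example.

```python
import math

def build_knn_graph(points, k):
    """Link every point to its k closest other points (brute force, toy sizes only)."""
    graph = {}
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        graph[i] = others[:k]
    return graph

def greedy_search(points, graph, query, entry):
    """Hop to whichever linked neighbor is closest to the query; stop when none is closer."""
    current = entry
    path = [current]
    while True:
        best = min(graph[current], key=lambda j: math.dist(points[j], query))
        if math.dist(points[best], query) >= math.dist(points[current], query):
            return current, path
        current = best
        path.append(current)

# Toy data (assumption): a 5 x 5 grid of 2-D points standing in for embeddings.
POINTS = [(float(x), float(y)) for x in range(5) for y in range(5)]
GRAPH = build_knn_graph(POINTS, k=4)
QUERY = (3.2, 1.9)

found, path = greedy_search(POINTS, GRAPH, QUERY, entry=0)
print(POINTS[found], "reached after", len(path) - 1, "hops")
```

On this regular grid the walk ends at the true nearest point while only measuring distances to the neighbors along its path. On irregular data a greedy walk can stop at a local minimum, which is one way an approximate index misses a true neighbor.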

The key knob is recall versus latency: searching more of the index finds more of the true neighbors but takes longer, while searching less is faster but may miss a true neighbor.
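Recall is usually measured by comparing an approximate result against the exact top k. The sketch below does that with a crude stand-in for an ANN index: it scores only a random fraction of the vectors, so the fraction plays the role of the speed knob. The random data, the `sampled_top_k` helper, and the parameter names are hypothetical and chosen only to make the tradeoff measurable.

```python
import math
import random

def exact_top_k(query, vectors, k):
    """Ground truth: compare the query to every vector."""
    order = sorted(range(len(vectors)), key=lambda i: math.dist(query, vectors[i]))
    return order[:k]

def sampled_top_k(query, vectors, k, fraction, rng):
    """Stand-in for an approximate index: only score a random subset of vectors."""
    n = max(k, int(len(vectors) * fraction))
    candidates = rng.sample(range(len(vectors)), n)
    candidates.sort(key=lambda i: math.dist(query, vectors[i]))
    return candidates[:k]

def recall_at_k(approx_ids, exact_ids):
    """Share of the true top-k neighbors that the approximate search returned."""
    exact = set(exact_ids)
    return len(exact & set(approx_ids)) / len(exact)

rng = random.Random(0)  # fixed seed so runs are repeatable
VECTORS = [[rng.random() for _ in range(8)] for _ in range(500)]
QUERY = [rng.random() for _ in range(8)]
K = 10

truth = exact_top_k(QUERY, VECTORS, K)
for fraction in (0.1, 0.5, 1.0):
    approx = sampled_top_k(QUERY, VECTORS, K, fraction, rng)
    print(f"scored {fraction:.0%} of vectors, recall@{K} = {recall_at_k(approx, truth):.2f}")
```

Scoring every vector always gives a recall of 1.0; smaller fractions do less work and typically return fewer of the true neighbors. Real indexes expose analogous knobs, and tuning them against a measured recall target is the usual workflow.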

What it does not give you

Vector search captures topical similarity but can be weak on exact terms such as a part number or a rare name, where lexical matching shines. Results also depend entirely on the embedding model, and vectors produced by different models are not comparable, so changing models means re-embedding the whole corpus.

Key idea

Vector search embeds queries and documents into a shared space and uses approximate nearest neighbor indexes to retrieve semantically similar results, trading some recall for speed.
