A Different Kind Of Query
Traditional databases find rows by exact match. A vector database stores high dimensional embeddings that represent meaning, and finds rows by similarity. Instead of asking for an exact value, you ask for the nearest neighbors to a query vector.
How Search Stays Fast
- Comparing a query against every vector is brute force and slow at scale.
- An approximate nearest neighbor index trades a little accuracy for huge speed.
- A popular structure is HNSW, a navigable graph that hops toward close vectors.
- Similarity uses a distance like cosine or dot product.
Where It Fits
- Powering semantic search over documents and images.
- Retrieval for language models in a retrieval augmented pipeline.
- Recommendations based on closeness in embedding space.
Key idea
A vector database indexes embeddings and answers nearest neighbor queries using approximate search like HNSW, enabling semantic similarity instead of exact match lookups.