More than an index
An approximate nearest neighbor (ANN) index finds nearby vectors, but a vector database wraps that index in the machinery needed for real use. It must ingest vectors, persist them, serve queries, and stay consistent as data changes.
The core components
- Ingestion: vectors and their metadata arrive and are written to storage.
- Index builder: constructs and updates the ANN structure, such as an HNSW graph or an IVF index.
- Query engine: embeds the incoming query or accepts a ready-made query vector, searches the index, and applies metadata filters.
- Storage: keeps both the raw vectors and the index, often with persistence to disk.
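
The components above can be sketched as a toy store. This is a minimal illustration, not how any particular product works: the names TinyVectorStore and cosine are hypothetical, an exact scan over every record stands in for the ANN index, and a JSON file stands in for durable storage.

```python
import json
import math


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyVectorStore:
    """Hypothetical toy store: ingestion, filtered search, persistence."""

    def __init__(self):
        # Storage: id -> (vector, metadata). A real system would also keep
        # an ANN structure here; this sketch scans every record instead.
        self.records = {}

    def ingest(self, record_id, vector, metadata=None):
        """Ingestion: write a vector and its metadata to storage."""
        self.records[record_id] = (list(vector), dict(metadata or {}))

    def query(self, vector, k=3, where=None):
        """Query engine: filter on metadata, then rank by similarity."""
        where = where or {}
        scored = [
            (cosine(vector, vec), record_id)
            for record_id, (vec, meta) in self.records.items()
            if all(meta.get(key) == value for key, value in where.items())
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(record_id, score) for score, record_id in scored[:k]]

    def save(self, path):
        """Persistence: dump all records to a JSON file."""
        with open(path, "w", encoding="utf-8") as f:
            json.dump(self.records, f)

    @classmethod
    def load(cls, path):
        """Rebuild a store from a file written by save()."""
        store = cls()
        with open(path, encoding="utf-8") as f:
            for record_id, (vec, meta) in json.load(f).items():
                store.ingest(record_id, vec, meta)
        return store
```

The sketch filters before it ranks, which is only one possible design. With a real ANN index, filtering may instead happen during or after the index search, and that choice affects both speed and recall.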
Operational concerns
- Updates: graph indexes can be costly to modify, so many systems batch changes or rebuild the index periodically.
- Sharding: large collections are split across nodes; shards are searched in parallel and their results are merged into a single top-k list.
- Consistency: newly added vectors must become searchable within an acceptable delay.
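
The sharding item above can be illustrated with a small scatter-gather sketch. It rests on assumptions: each shard is just an in-memory dict of id to vector, threads stand in for separate nodes, and the helper names (squared_distance, search_shard, search_all) are made up for this example. Each shard returns its own k nearest hits, and the coordinator merges them into a global top k.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search_shard(shard, query, k):
    """Return the k nearest (distance, id) pairs within one shard."""
    scored = ((squared_distance(query, vec), vec_id)
              for vec_id, vec in shard.items())
    return heapq.nsmallest(k, scored)


def search_all(shards, query, k):
    """Search every shard in parallel, then merge the partial results."""
    workers = max(1, len(shards))  # ThreadPoolExecutor needs at least 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda s: search_shard(s, query, k), shards))
    # Any global top-k hit is also in its own shard's top k,
    # so merging the per-shard lists loses nothing.
    return heapq.nsmallest(k, (hit for part in partials for hit in part))
```

A usage example under the same assumptions:

```python
shards = [
    {"a": [0.0, 0.0], "b": [5.0, 5.0]},
    {"c": [1.0, 0.0], "d": [9.0, 9.0]},
]
nearest = search_all(shards, [0.0, 0.0], k=2)
print([vec_id for _, vec_id in nearest])
```

Asking each shard for k results keeps the merge exact in this sketch because every shard is scanned exhaustively. With approximate per-shard indexes, the merged list inherits whatever recall each shard achieves.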
Why this matters
Choosing a vector database is not only about ANN speed. Filtering, updates, scaling, and durability often decide whether it fits production.
Key idea
A vector database surrounds an ANN index with ingestion, storage, a query engine, and scaling logic, so production fit depends on far more than raw search speed.